Today I was reading Google's Webmaster Guidelines on duplicate content and the steps Google advises webmasters to take when the situation applies. It's quite interesting, as I did not know some of these facts. Duplicate content is very common on websites, particularly when multiple pages with more or less similar content are created for different channels.
However, according to Google's guidelines, it is best not to have pages with identical or nearly identical content! Here is what Google recommends:
- Use permanent redirection (301s): The debate over whether to use a permanent or a temporary redirect has been going on for ages, but Google's guidelines recommend permanent (301) redirects. For web pages with similar or identical content, it is best to choose one as the main page and permanently redirect the other identical pages to it. Alternatively, you can merge all the pages into one.
- Consistent URL string: This is an interesting one. I did not know Google cares about this: try to keep your internal linking consistent. In other words, pick one form of each URL and don't mix links to
http://www.yourwebsite.com/page/ and http://www.yourwebsite.com/page and http://www.yourwebsite.com/page/index.htm.
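- Top-level domain preferred: Google recommends using top-level domains where possible, particularly for country-specific content, because a country domain tells Google more about who the content is for than a subdomain or subdirectory does. It is therefore better to avoid subdomains or domain aliases that simply serve content similar to a page on your main domain.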
- Break repetitive content into multiple pages with unique content: Google says that instead of including lengthy copyright text at the bottom of every page, you should include a very brief summary and link to a page with more details. The same idea applies to similar pages: if you have a travel site with separate pages for two cities but the same information on both, you could either merge them into one page about both cities or expand each page so that it contains unique content about its city.
- Avoid empty web pages: Users don't like seeing "empty" pages, so avoid placeholders where possible. For example, don't publish pages for which you don't yet have real content. If you do create placeholder pages, use the noindex meta tag to block these pages from being indexed.
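
To make the redirect and consistent-URL points a bit more concrete, here is a rough Python sketch of how the three URL variants mentioned above could be collapsed into one preferred form, and how the leftovers could be listed as 301 candidates. The trailing-slash convention, the `INDEX_FILES` list and the function names are my own assumptions for illustration, not something Google prescribes, so adapt them to however your site is set up.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; adjust to match your own server.
INDEX_FILES = {"index.htm", "index.html"}


def canonical_url(url: str) -> str:
    """Return one preferred form of a URL (assumed convention: trailing slash)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_FILES:
        # /page/index.htm -> /page/
        path = head + "/"
    elif last and "." not in last:
        # /page -> /page/ (segments containing a dot are treated as files)
        path = path + "/"
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment)
    )


def redirect_map(urls):
    """Map each non-canonical variant to the URL it should 301-redirect to."""
    return {u: canonical_url(u) for u in urls if canonical_url(u) != u}


if __name__ == "__main__":
    variants = [
        "http://www.yourwebsite.com/page/",
        "http://www.yourwebsite.com/page",
        "http://www.yourwebsite.com/page/index.htm",
    ]
    for old, new in redirect_map(variants).items():
        print(f"301 {old} -> {new}")
```

Running internal links through something like `canonical_url` before publishing keeps the linking consistent, while the output of `redirect_map` is only a list of candidates: the actual 301 rules still have to be configured on your web server.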
If you think this post is useful, please recommend me at the bottom of the page. ;)