Many website owners fear that duplicate content in any form will hurt their website’s search engine rankings.
That fear is not unfounded.
Duplicate content is content that appears at more than one unique website address. While there is no official penalty for duplicate content, multiple instances of similar (or identical) content can confuse search engines that are trying to determine which version is the most relevant for any given search.
In fact, it is estimated that about 29 percent of all web content is duplicate content.
Search engines will rarely show multiple versions of the same content, so it’s important to understand how duplicate content can happen and what you should do about it.
How duplicate content happens
The good news is that most website owners are not intentionally creating duplicate content.
The bad news is that you could be doing it without even realizing it.
When we think of content, it’s easy to assume that we’re only referring to blog posts or other editorial content, and that duplicate content happens when scrapers republish your blog on their own sites. This can happen. However, “content” is all-encompassing and includes such things as product information as well.
Think about ecommerce: It’s common for many different websites to sell the same products. When they all use the manufacturer’s description, the amount of duplicate content skyrockets. It’s a common issue for ecommerce websites.
URL parameters can also create duplicate content problems. These include link-tracking and other analytics tags, because each tagged variation is a distinct URL that serves the same page. In addition, session IDs (where each website visitor is assigned a different identifying tag) can contribute to duplicate content.
Printer-friendly versions of your content can likewise cause multiple versions of a webpage to be indexed.
Duplicate content also arises if your website is reachable both with and without the “www” prefix in the URL and serves identical content at both addresses.
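As a rough illustration of the parameter problem, the sketch below collapses tagged URLs back to a single address by dropping parameters that do not change the page. The parameter names in IGNORED_PARAMS and the example.com URLs are assumptions chosen for illustration; your own analytics or session parameters may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical parameters that track visitors but do not change page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sid"}


def strip_ignored_params(url: str) -> str:
    """Return the URL with tracking and session parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Rebuild the URL with only the parameters that affect the content.
    return parts._replace(query=urlencode(kept)).geturl()


if __name__ == "__main__":
    variants = [
        "https://example.com/shoes?color=red&utm_source=newsletter",
        "https://example.com/shoes?color=red&sid=abc123",
        "https://example.com/shoes?color=red",
    ]
    # All three variants collapse to the same address.
    print({strip_ignored_params(u) for u in variants})
```

Parameters that do change what the visitor sees (such as color above) are kept, since those pages are not necessarily duplicates.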
This can also happen with separate website versions at http:// and https://.
If both versions of your website are visible to search engines, you have inadvertent duplicate content on your hands.
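To see how quickly these versions multiply, the following sketch maps the four possible combinations of http/https and www/non-www onto one preferred address. The preferred scheme and host are assumptions (here https and www.example.com), and ports or login details in the URL are simply dropped.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferences; pick whichever version you want indexed.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def bare_host(host: str) -> str:
    """Lowercase the host and drop a leading 'www.' for comparison."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def to_preferred_origin(url: str) -> str:
    """Rewrite any http/https, www/non-www variant to the preferred version."""
    parts = urlsplit(url)
    if bare_host(parts.hostname or "") != bare_host(PREFERRED_HOST):
        return url  # A different website: leave it untouched.
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
    )


if __name__ == "__main__":
    variants = [
        "http://example.com/about",
        "http://www.example.com/about",
        "https://example.com/about",
        "https://www.example.com/about",
    ]
    # Four crawlable addresses, one page.
    print({to_preferred_origin(u) for u in variants})
```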
What to do about duplicate content
The first step to fixing any duplicate content issues is deciding which of the versions is the correct one that you want properly indexed by search engines.
Once that is decided, be sure to canonicalize that content for search engines. There are a few ways to do this, including using:
- The rel=canonical link element in the head of your content webpage (which tells search engines that a specific page should be treated as though it were a copy of an identified URL, so that ranking credit is consolidated on that URL)
- A 301 redirect to the correct URL (which essentially redirects from the duplicate page to the original page)
- Google Search Console’s parameter handling tool (which let you indicate whether Googlebot should crawl URL parameters any differently, alongside a separate setting for the preferred domain of your website. These settings applied only to Google, not to other search engines, and Google has since retired both, so rely on the other methods here for current websites.)
- A meta robots noindex tag (which can exclude any particular page from a search engine’s index. Keep the page crawlable, though: if crawling is blocked, search engines never see the noindex tag.)
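The options above can be sketched in a few lines. This is a minimal illustration, not a production setup: the function names, the PREFERRED_ORIGIN value, and the tiny WSGI app are all assumptions made for the example, and most websites would configure redirects in their web server or CMS instead.

```python
from html import escape

# Assumed preferred version of the website.
PREFERRED_ORIGIN = "https://www.example.com"

# Tag that asks search engines to keep a page out of their index.
NOINDEX_META_TAG = '<meta name="robots" content="noindex">'


def canonical_link_tag(preferred_url: str) -> str:
    """Build the rel=canonical element to place in a page's <head>."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


def redirect_app(environ, start_response):
    """Minimal WSGI app: answer every request with a 301 to the preferred site."""
    target = PREFERRED_ORIGIN + environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    if query:
        target += "?" + query
    start_response(
        "301 Moved Permanently",
        [("Location", target), ("Content-Length", "0")],
    )
    return [b""]


if __name__ == "__main__":
    print(canonical_link_tag("https://www.example.com/shoes"))
    print(NOINDEX_META_TAG)
```

A duplicate page would normally use either the canonical element or a redirect, not both, since a redirected page is never rendered.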
In addition, be consistent when linking internally within your website (using or not using the “www” prefix, for example).
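One way to check internal linking is to scan a page for links that point at your own website but use a different version of the address. The sketch below assumes https://www.example.com is the preferred version; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed preferred version of the website.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def bare_host(host: str) -> str:
    """Lowercase the host and drop a leading 'www.' for comparison."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def inconsistent_internal_links(html_text: str) -> list:
    """Return internal links that do not use the preferred scheme and host."""
    collector = LinkCollector()
    collector.feed(html_text)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        host = (parts.hostname or "").lower()
        if not host:
            continue  # Relative links inherit the current page's address.
        same_site = bare_host(host) == bare_host(PREFERRED_HOST)
        if same_site and (host != PREFERRED_HOST or parts.scheme != PREFERRED_SCHEME):
            flagged.append(href)
    return flagged
```

Running this over a page that links to http://example.com/contact would flag that link, while relative links and links to other websites are ignored.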
If you’re concerned about content scrapers, add a self-referring rel=canonical link to your existing webpages. While not all scrapers port over the full HTML code of their sources, some do, and using that canonical tag will help you get the credit for being the original source.
If you’re syndicating content, make sure that the syndicating website adds a link back to the original content (and not a variation of that link).
It’s important to keep duplicate content in perspective. Unless thousands of duplicate pages appear all at once, the more common instances are unlikely to hurt your search ranking enough to justify an inordinate amount of time or resources.
Be diligent about avoiding the common mistakes listed above, which are simply good SEO (search engine optimization) practices anyway, and always do whatever communicates most clearly to search engine crawlers. That clarity is the foundation of a strong search ranking.