TL;DR

Duplicate content means the same (or near-identical) content exists at multiple URLs. Google doesn't penalize it directly, but it forces Google to choose which version to rank — and it may pick the wrong one. Fix it with canonicals or 301 redirects.

Key Points

✓ Duplicate content is more often a technical accident (URL parameters, www vs non-www, HTTP vs HTTPS) than deliberate manipulation

✓ Google selects one 'canonical' version of duplicate content to index and rank, and it may not choose the version you prefer

✓ Cross-site duplication (content scraped or syndicated from other sites) can cause Google to attribute the content to the wrong source

✓ Near-duplicate content (the same article on 50 location pages with only the city name swapped) is treated similarly to exact duplicates

Common Causes of Duplicate Content

Most duplicate content is unintentional^[1]. Common technical causes include: URL parameter variations (page.com/product?color=red and page.com/product?color=blue showing the same content), www and non-www versions both being accessible, HTTP and HTTPS versions both resolving, printer-friendly page versions, and session ID parameters that create a unique URL for every visit to the same page. E-commerce sites are particularly vulnerable: a product listed in multiple categories produces several URL paths that all serve identical content. CMS platforms that auto-generate tag, category, and archive pages can also create dozens of near-duplicate listing pages. The fix for most technical duplicates is a canonical tag or a 301 redirect.
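To see how these variants collapse onto a single page, the sketch below normalizes URLs and groups the ones that end up identical. It is a minimal illustration, not a standard algorithm: the `IGNORED_PARAMS` list, the preference for HTTPS and the non-www host, and the function names are all assumptions that would need to match your own site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that do not change page content.
# Adjust it per site; a parameter like "color" may or may not belong here.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def normalize_url(url: str) -> str:
    """Collapse common duplicate variants of a URL into one normalized form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: the non-www host is the preferred one
    # Drop ignored parameters and sort the rest so parameter order is irrelevant.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Assumption: HTTPS is the preferred scheme. The fragment is dropped
    # because it is never sent to the server.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return {normalized_url: [original urls]} for groups with 2+ members."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Running `group_duplicates` over a crawl export gives a rough list of URL sets that probably serve the same page and are candidates for a canonical tag or redirect.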

How Google Handles Duplicates

Google groups duplicate pages into clusters and selects one as the canonical, the version it shows in search results^[1]^[2]. This selection uses signals including which version has more backlinks, which uses HTTPS, which is listed in an XML sitemap, and which version Google has crawled more often. You can guide this choice with the `rel=canonical` tag, but Google treats it as a hint, not a directive. When canonical selection goes wrong, you may find an internal duplicate outranking your preferred landing page, or an old AMP version being served instead of your main page.
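Because the `rel=canonical` tag is the main signal you control, it helps to check what each page actually declares. The sketch below uses Python's standard `html.parser` to pull canonical URLs out of a page's HTML. The class and function names are hypothetical, and it only reads the tag; it says nothing about which URL Google ultimately selects.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def declared_canonicals(html: str):
    """Return the list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

A page that returns an empty list, or more than one URL, is sending a missing or conflicting signal and is worth reviewing by hand.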

Fixing and Preventing Duplicate Content

The primary fixes for duplicate content are: (1) canonical tags: add `<link rel="canonical" href="...">`, with the `href` set to the main page's URL, to the `<head>` of every duplicate version; (2) 301 redirects: permanently redirect all variants to the canonical URL; (3) consistent internal linking: always link to the canonical URL throughout your site, never to parameter variants; (4) `noindex` for URL parameter pages that should never be indexed, or `robots.txt` rules to stop them being crawled^[2]. Do not combine the two on the same URL, because a crawler blocked by `robots.txt` never sees the page's `noindex` or canonical tag. For cross-site syndication, ask syndication partners to add a canonical tag pointing back to your original. Run a content audit to find existing duplicates; crawl tools like Screaming Frog detect near-duplicate content by comparing content similarity scores across all your URLs.
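To make the idea of a content similarity score concrete, here is one simple way to compute it: split each page's text into overlapping word sequences (shingles) and measure how much the two sets overlap (Jaccard similarity). This is an illustrative technique chosen for the example, not a description of how Screaming Frog or Google scores pages; the function names and the shingle size of 3 are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Similarity from 0.0 (no shared shingles) to 1.0 (identical sets)."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 1.0  # two empty texts are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)
```

Two location pages that differ only in the city name share most of their shingles and score far higher than unrelated pages. The threshold at which a pair counts as a near-duplicate is a judgment call to tune against your own content.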

SOURCES

Google Search Central — Consolidate Duplicate URLs

Moz — Duplicate Content Guide

Last updated: June 9, 2026

Related Terms

Canonical URL

An HTML tag that tells search engines which version of a page is the preferred, authoritative URL when multiple URLs serve the same or very similar content.

301 Redirect

An HTTP status code that permanently redirects one URL to another, telling browsers and search engines that the original page has moved and passing the majority of its link equity to the new destination.

Crawlability

The ability of search engine bots to access, navigate, and read the pages on your website without encountering technical barriers.

Content Audit

A systematic review of all existing content on a website to evaluate performance, identify gaps, and decide what to update, consolidate, or remove.
