In April 2025, we analyzed 900,000 newly created web pages and discovered that 74.2% contained AI-generated content.
With the rapid growth of generative AI, businesses, educators, and publishers are asking a critical question: how can we tell what’s written by humans and what’s produced by machines?
The answer: it’s possible, but not foolproof. Here’s how to approach AI detection effectively, the limitations you need to understand, and a better way to get more reliable results.
Learn more in our study: 74% of New Webpages Include AI Content (Study of 900k Pages)
Good detectors don’t just give you a single yes-or-no verdict: they also break the text down and show you the likelihood that different passages are AI-generated, provide an overall article-level likelihood score, and in some cases even attempt to identify which models (such as GPT-4o) were likely used to create the content.
We ran a small-scale test comparing several of the most popular AI detectors to see how they perform in practice. The table below shows our results:
Based on my testing, Ahrefs’ AI detector and Copyleaks were the best-performing AI detectors, with GPTZero and Originality.ai close behind. At the other end of the scale, Grammarly and Writer performed the worst in my testing.
AI content detector | Score |
---|---|
Ahrefs | 13/18 |
Copyleaks | 13/18 |
GPTZero | 12/18 |
Originality.ai | 12/18 |
Scribbr | 10/18 |
ZeroGPT | 9/18 |
Grammarly | 6/18 |
Writer | 4/18 |
Learn more in my full write-up: The 8 Best AI Detectors, Tested and Compared
Like LLMs, AI detectors are probabilistic—they estimate likelihood, not certainty. They can be highly accurate, but false positives are inevitable. That’s why you shouldn’t base decisions on a single result. Run multiple checks, look for patterns, and combine findings with other evidence.
All AI detectors share the same fundamental limitations, regardless of the tool or technology used.
- Heavily edited or “humanized” AI text may evade detection. “Post-processing’ (things like rephrasing sentences, swapping in synonyms, rearranging paragraphs, or running the text through a grammar checker) can disrupt the statistical signals that detectors look for, reducing their accuracy.
- Basic detectors may lack accuracy and advanced features. Detection tools require frequent updates to stay ahead of new AI models—generative AI evolves quickly, and detectors need regular retraining to recognize the latest writing styles and avoidance techniques. At Ahrefs, our detector supports multiple leading models, including models from OpenAI, Anthropic, Meta, Mixtral, and Qwen, so you can check content against a broader range of likely sources.
- Effectiveness varies by language, content type, and model. Detectors trained primarily on English prose may struggle with technical writing, poetry, or less common languages.
- Ambiguous cases (like human text edited by AI) can blur results. These hybrid workflows create mixed signals that can confuse even advanced systems.
- Even the best tools can produce false positives or negatives. Statistical detection is never infallible, and occasional misclassifications are inevitable because the patterns these systems rely on can overlap between human and AI writing, and subtle edits or atypical writing styles can easily blur the distinctions.
Remember: false accusations based on incorrect AI detection results can seriously damage the reputation of individuals, companies, or academic institutions.
With these limitations in mind, it’s a good idea to corroborate any detector output with additional methods before drawing conclusions.
Human judgment can be extremely helpful for adding context to results from AI detectors. By examining context—such as patterns across multiple articles, a history of posts on social media, or the surrounding circumstances of publication—you can better gauge the likelihood that AI was involved in the writing.
Signs to look for:
- Overly consistent tone with no subtle quirks. Human writing is inherently a bit messy and unpredictable, with small variations in style, rhythm, and word choice that reflect personality and context. AI-generated text can sometimes lack these imperfections, producing a uniform tone that feels slightly too polished or mechanical.
- Verbosity. AI is very good at stretching simple ideas into long-winded explanations.
- Lack of new information. AI outputs often read as generic or surface-level (this is particularly obvious on LinkedIn: many AI-generated comments simply restate the original author’s idea in new words without adding any meaningful perspective or value).
- Tell-tale word choices. AI has a preference for slightly “off” idioms like “ever-evolving landscape”, formulaic hooks (“This isn’t X… it’s Y”), or overuse of em dashes and emojis.
- Incentives. Is there a clear motivation for the author to use AI content?

I see you, ChatGPT.
None of these signs offer definitive evidence for AI content, but they can add helpful context to other forms of evidence.
If you run an AI detector on just one article, an unreliable result can be problematic. But that issue becomes less important when you look at results at scale. Running this process across many pages gives you a much clearer picture of how AI is used as part of the company’s broader marketing strategy.
With Ahrefs’ Top pages report in Site Explorer, you can see an “AI Content Level” column for almost any website page. From there, you can even inspect any individual URL and get an idea of the AI models that were likely used in the page’s creation.
Here’s a video talking through this process:
For a quick tip: use this report to spot top-ranking, heavily AI-generated content and consider creating your own AI version. If it’s ranking, it’s meeting search intent—making it a potential opportunity for you, and your AI content workflow.
Further reading
Similar Posts
More Videos Added to Module Developer Guide Tutorials
This week, we’ve added 7 new videos to existing tutorials in our Module Developer Guide. The Module Developer Guide was created for developers familiar with PHP but new to Drupal module development. Updated tutorials — now with video! Here’s a list of tutorials that now have videos embedded. Select the + icon to add the…
SEO “Chunk Optimization” is Overrated
“Chunk optimization” is the latest shiny object in SEO, but it’s not the shortcut to visibility in AI search that many think it is. Across LinkedIn threads and AI‑SEO guides, the promise is the same: format your content into perfect “chunks” and you’ll get chosen for Google’s AI Overviews or cited in AI search results. The…
Drush Custom Command Tutorials Updated
Last week we updated most of the Drupal Plugin API content to recommend, and demonstrate, using PHP attributes instead of annotations. This week I’m back with another update doing the same thing for our content on creating custom Drush commands. Which now also use PHP attributes instead of annotations. The big changes to this set…
SEO Is the Worst It’s Ever Been (And It’s Still Your Best Marketing Channel)
Here are two ideas that can be true at the same time: SEO offers worse returns than it used to. SEO is still one of the best marketing channels. SEO really has changed: Talk to website owners, and you’ll even hear whispers that some companies are losing 20–40% of their monthly clicks from search. 1…
Why .store Is the Perfect Domain Extension for Your Online Shop
In the vast, ever-evolving eCommerce landscape, filled to the brim with new digital marketplaces, standing out is vital for online shop owners. And you shouldn’t underestimate the impact your top-level domain can have on your brand’s online presence. Over the last 12 months, the .store registry commissioned Contrast Digital to run a TLD experiment to…
How can ChatGPT Affect Domain Names?
Changes you know about – ChatGPT, the chatbot, founded by OpenAI that has caught the fancy of everyone from developers, marketers, AI experts, and businesses. Now, there are a lot of predictions about how this savvy little bot will change the way we work. Given our love for domains, we thought we could take a…