LLMs: The Digital Parasites & The Gluttons

Tech Talks

Published on 5 October 2025

Graph from official Cloudflare blogpost, detailing the parasitic nature of LLMs in 2025.

Data straight from Cloudflare. Q2-Q3 2025.

Ever wondered who the biggest parasite in the AI world is? It's not OpenAI. A new report exposes how Anthropic's parasitic behaviour is draining the web, while Meta's strip-mining tactics consume it. A look at the data and the grim future it predicts for online creators.

https://blog.cloudflare.com/ai-search-crawl-refer-ratio-on-radar/

https://blog.cloudflare.com/control-content-use-for-ai-training/

https://blog.cloudflare.com/ai-crawler-traffic-by-purpose-and-industry/

The Parasites:

I’ve been wanting to write this post ever since Cloudflare dropped its latest report on AI web crawlers, confirming a truth many of us have felt for a while: the relationship between AI companies and the open web is fundamentally broken. It's a purely extractive relationship. I'll only be talking a bit about this part; the reports I linked are far more detailed, and they are worth a read as well.

To be clear, when I talk about parasitic behaviour in this context, I’m referring to the death of the web’s old grand bargain. For decades, the deal was simple: publishers created content, and search engines crawled it in exchange for sending referral traffic back. It was a loop of value. Now, AI crawlers ingest that same content not to refer, but to replace. They scrape the web’s knowledge to provide answers directly within their own closed interface, cutting the original creators out of the loop entirely. They take the value and give almost nothing back. That’s digital parasitism.

You’d think the biggest offender would be the biggest name, right? OpenAI, with its massive scale, must surely be the worst parasite. But, to the surprise of absolutely no one who’s been paying attention, the crown for the most parasitic of all generative AI companies goes to the self-proclaimed "ethical" B Corp: Anthropic.

Anthropic: The Apex Parasite

Anthropic’s behaviour is in a league of its own. The key metric here is the "crawl-to-refer ratio," which measures how many pages a bot scrapes for every single human visitor it sends back. The numbers for Anthropic are mind-blowing.

In January 2025, their ratio was an almost unbelievable 286,930:1. While that has since "improved," recent reports still peg them with ratios between 50,000:1 and 73,000:1. To put that in perspective, OpenAI’s ratio hovers around 880:1 to 1,700:1, and Google’s is around 9:1. At its known peak, Anthropic’s ratio was more than 100 times that of OpenAI, the biggest genAI company in the world. It is an absolutely insane situation.
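To make the metric concrete, here is a minimal Python sketch of the crawl-to-refer ratio and the comparison above. The function names and the dictionary are my own illustration; the numbers are just the ratios quoted in this post, not raw Cloudflare data.

```python
def crawl_to_refer_ratio(pages_crawled: int, referrals: int) -> float:
    """Pages a bot crawls for every visitor it refers back to the site."""
    if referrals <= 0:
        raise ValueError("ratio is undefined without at least one referral")
    return pages_crawled / referrals


# Ratios quoted in this post (pages crawled per one referral).
QUOTED_RATIOS = {
    "Anthropic (Jan 2025 peak)": 286_930,
    "Anthropic (recent, low)": 50_000,
    "Anthropic (recent, high)": 73_000,
    "OpenAI (low)": 880,
    "OpenAI (high)": 1_700,
    "Google": 9,
}


def times_higher(ratio_a: float, ratio_b: float) -> float:
    """How many times larger ratio_a is than ratio_b."""
    return ratio_a / ratio_b


if __name__ == "__main__":
    peak = QUOTED_RATIOS["Anthropic (Jan 2025 peak)"]
    for name in ("OpenAI (high)", "OpenAI (low)", "Google"):
        factor = times_higher(peak, QUOTED_RATIOS[name])
        print(f"Anthropic's peak is {factor:,.0f}x {name}")
```

Even against the high end of OpenAI's range, the peak figure comes out well over 100 times larger, which is where the "more than 100x" comparison comes from.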

This isn't a new problem and, as with many of their other questionable behaviours, they don’t stop when called out. Just ask iFixit, who reported their servers were hammered with nearly a million requests from ClaudeBot in a single 24-hour period, in direct violation of iFixit’s terms of service. Anthropic’s response? A shrug and a link to their FAQ, telling them to use a robots.txt file to opt out. This has been a consistent pattern, with system administrators across the web describing ClaudeBot’s activity as so aggressive it resembles a DDoS attack.
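For anyone who hasn't written one, the robots.txt opt-out looks roughly like the sketch below, checked with Python's standard `urllib.robotparser`. The rules and the example.com URL are assumptions for illustration, not iFixit's actual file. Keep in mind that robots.txt is purely advisory: this shows how a well-behaved crawler would read the rules, not a guarantee that any bot obeys them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block ClaudeBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/guides/repair"  # placeholder URL
    for agent in ("ClaudeBot", "Googlebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

The same pattern works for any crawler name you want to shut out; you add another `User-agent` / `Disallow` pair for each one.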

There are absolutely tangible costs to these behaviours. The open-source project Read the Docs reported saving $1,500 per month in bandwidth costs after blocking AI crawlers. It’s a forced subsidy from creators to a multi-billion-dollar corporation. More importantly, it erodes the entire economic model of the web. No clicks mean no ad revenue, no subscriptions, and no brand visibility for the people who actually create the information.

Meta: The Digital Strip-Miner

On the other side of this grimy coin, we have another tech giant. A company spending billions on a generative AI strategy that seems to have no clear direction, churning out bottom-of-the-barrel products while its peers pull ahead. No, not Microsoft, silly. I’m talking about Meta!

If Anthropic is the parasite, quietly draining the lifeblood of its host with an impossibly imbalanced exchange, then Meta is the digital strip-miner. It’s less about finesse and all about brute-force, overwhelming volume.

Since last year, as Meta scrambled to build a "superintelligence" team, its data hunger has accelerated into a frenzy of web scraping. The numbers are, once again, courtesy of the web’s watchdogs. According to a report from Fastly, Meta’s AI crawlers are the most dominant on the web by a massive margin, accounting for 52% of all AI crawler traffic. That is more than double the share of either Google (23%) or OpenAI (20%), and more than the two combined. In the span of just one year, from July 2024 to July 2025, raw requests from its Meta-ExternalAgent bot exploded by 843%.
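If you want to sanity-check those figures, a few lines of Python will do. This is only arithmetic on the percentages quoted above; the variable and function names are my own, and I'm assuming "exploded by 843%" means an 843% increase over the July 2024 baseline.

```python
# AI crawler traffic shares quoted from the Fastly report, in percent.
SHARES = {"Meta": 52, "Google": 23, "OpenAI": 20}


def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a before/after multiplier."""
    return 1 + percent_increase / 100


def share_vs_others(shares: dict, leader: str) -> float:
    """Leader's share divided by the combined share of the other listed crawlers."""
    others = sum(value for name, value in shares.items() if name != leader)
    return shares[leader] / others


if __name__ == "__main__":
    print(f"Meta vs Google + OpenAI combined: {share_vs_others(SHARES, 'Meta'):.2f}x")
    print(f"Meta vs Google alone: {SHARES['Meta'] / SHARES['Google']:.2f}x")
    print(f"Meta vs OpenAI alone: {SHARES['Meta'] / SHARES['OpenAI']:.2f}x")
    print(f"An 843% increase is a {growth_multiplier(843):.2f}x multiplier")
```

Under that reading, an 843% increase means Meta-ExternalAgent was sending well over nine times as many requests in July 2025 as it did a year earlier.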

This is a torrential deluge. This kind of high-volume scraping overwhelms servers, consumes vast amounts of bandwidth, and can mimic the effects of a DDoS attack, even if unintentional. It also pollutes analytics, with one report noting that AI scrapers contributed to an 86% year-over-year increase in general invalid traffic, making it harder for businesses to understand their real human audience.

And for what? After strip-mining half the web’s AI-related bot traffic, what does Meta have to show for it? A suite of AI products that are widely regarded as lagging behind the competition. They’ve consumed immense resources, strained the web’s infrastructure, and devalued content, all while failing to produce anything of significant worth. It’s the digital equivalent of leveling a rainforest to produce a single toothpick.

The web is being assaulted from two fronts: the insidious, imbalanced extraction of the parasite and the overwhelming, brute-force consumption of the strip-miner. Both are unsustainable, and both are destroying the ecosystem they claim to be learning from.

✅ The Verdict

So, while Anthropic is busy hyping up its recently released Claude 4.5 with benchmarks and pretentious nonsense, remember the foundation it’s built on. Remember the insane crawl ratios, the disregard for terms of service, and the host of other downright questionable and illegal behaviours they've been consistently pulling over the years.

And as Meta continues to spend billions and flood the web with a digital tsunami of requests while producing nothing of note, we need to look at the bigger picture.

We are witnessing the systematic erosion of the organic web's economic foundation. AI models are trained on a universe of human-generated content, yet they simultaneously destroy the reasons for its future creation.

The question isn't hypothetical. Where will a tool like Microsoft's Gaming Copilot find new strategies once creators stop producing them? This parasitic cycle, where value is taken without reciprocation, is unsustainable. Once these systems have consumed all existing information, they will have nothing new to learn from. The current path leads only to a sterile digital landscape, populated by AIs endlessly recycling stale data.
