Strategy · November 23, 2025 · 13 min read

The Age of AI Slop: How the Internet Is Becoming a Copy of a Copy

Prajwal Paudyal, PhD

The web is being flooded with low-quality, AI-generated content. This isn't just annoying—it threatens to degrade our information ecosystem and trap AI models in a loop of self-deception.

Summary

The internet is filling up with "AI slop"—low-quality, mass-produced content generated by algorithms for profit or influence. This phenomenon goes beyond mere annoyance, posing a fundamental threat to our digital world. It degrades the quality of search engines, erodes trust, and makes finding reliable information increasingly difficult. More insidiously, as AI models are trained on this growing ocean of synthetic data, they risk entering a cycle of degradation known as "model collapse," where they learn from their own flawed outputs, becoming less accurate and more biased over time. This article explores the economics driving AI slop, the technical danger of this recursive data loop, and the likely human response: a flight to trusted, curated sources, much like how we learned to manage email spam. Ultimately, the rise of AI slop forces us to reconsider how we value and seek out genuine human expertise in a world saturated with synthetic noise.

Key Takeaways (TL;DR)

  • "AI slop" refers to the massive volume of low-quality, AI-generated content created cheaply for ad revenue or political manipulation.
  • The primary driver is economic: content farms can now produce infinite articles, recipes, and posts at near-zero marginal cost.
  • This flood of content degrades search engine results and makes it harder for users to find authentic, reliable information.
  • A critical long-term risk is "model collapse," where AIs trained on their own synthetic output become progressively less accurate and more detached from reality.
  • This process mirrors the early days of email spam, which clogged inboxes before we developed filters and learned to ignore junk.
  • The likely human adaptation will be a "flight to quality," where users increasingly rely on trusted brands, known experts, and curated platforms.
  • The value of human curation, expertise, and accountability becomes more pronounced in an information environment saturated with synthetic content.
  • Solving this problem isn't about banning AI content but developing better tools for verification and fostering stronger signals of trust and authority online.

The Dawn of the Slop-pocalypse

A startling, if difficult to verify, claim has begun circulating: that nearly half of all new articles published online are now generated by AI. While the precise number is debatable, the underlying trend is undeniable. The internet is experiencing a deluge of synthetic content, a phenomenon aptly dubbed "AI slop." This isn't the sophisticated, world-changing AI we were promised, but its mundane, messy cousin—endless streams of low-quality blog posts, soulless recipe articles, and fabricated news stories, all generated at near-zero cost.

This isn't just a new form of spam. It represents a fundamental shift in our information ecosystem. For decades, the web's value was built on a foundation of human knowledge, creativity, and communication. Now, that foundation is being diluted by a tidal wave of automated, often nonsensical, text. The immediate problem is obvious: it's harder to find what you're looking for. But the deeper, more systemic risks—to our collective knowledge and to the future of AI itself—are only just beginning to come into focus.

The Economics of Infinite Content

To understand why this is happening, one need only follow the money. The internet's advertising-based economy has always rewarded volume. More content means more pages to place ads on, which means more potential clicks and more revenue. Historically, the limiting factor was the time and effort required for a human to produce that content.

Generative AI removes that bottleneck entirely. An aspiring content entrepreneur no longer needs expertise in cooking, home repair, or travel. They can simply prompt a large language model (LLM) to generate a thousand articles on the topic, spin up a generic website, and fill it with ads. The goal isn't to create the single best recipe for lasagna; it's to create a hundred mediocre recipes that capture search traffic from a hundred different long-tail keywords.

The recipe blog is a perfect illustration. A human-written recipe often includes a personal narrative—a story about a grandmother, a trip to Italy, a kitchen disaster—that signals authenticity and care. It tells the reader, "A person stood behind this, tested it, and valued it enough to share." AI slop mimics this form without the substance. It generates a plausible-sounding story about a fictional grandmother, attaches it to a scraped or synthesized recipe, and publishes it. The cost is negligible, and even a trickle of ad revenue, when multiplied across thousands of such pages, can become a meaningful income stream.
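
To make that incentive concrete, here is a back-of-envelope sketch in Python. Every figure in it (the API price, article length, traffic, and ad rate) is an illustrative assumption, not measured data:

```python
# Back-of-envelope economics of a hypothetical AI content farm.
# Every figure below is an illustrative assumption, not measured data.

COST_PER_1K_TOKENS = 0.002       # assumed LLM API price (USD)
TOKENS_PER_ARTICLE = 1_500       # assumed length of one generated article
ARTICLES = 10_000                # size of the content farm
RPM = 2.00                       # assumed ad revenue per 1,000 pageviews (USD)
MONTHLY_VIEWS_PER_ARTICLE = 50   # assumed trickle of long-tail search traffic

generation_cost = ARTICLES * (TOKENS_PER_ARTICLE / 1_000) * COST_PER_1K_TOKENS
monthly_revenue = ARTICLES * MONTHLY_VIEWS_PER_ARTICLE / 1_000 * RPM

print(f"One-time generation cost: ${generation_cost:,.2f}")  # -> $30.00
print(f"Monthly ad revenue:       ${monthly_revenue:,.2f}")  # -> $1,000.00
```

Even under these deliberately modest traffic assumptions, the farm earns back its one-time generation cost many times over in its first month. No human-staffed publication can compete with those unit economics.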

Beyond simple profit, this method can be weaponized for political purposes. State-sponsored actors and fringe groups can now generate vast networks of websites and social media accounts to promote a specific narrative, drown out dissenting opinions, or simply sow chaos and distrust. The watchdog organization NewsGuard has already identified over 1,200 "Unreliable AI-Generated News" websites operating with little to no human oversight.

The Ouroboros Problem: When AI Eats Itself

The most insidious consequence of AI slop is what it does to the AI models themselves. Today's powerful LLMs, like those from Google, OpenAI, and Anthropic, were trained by scraping colossal amounts of data from the open internet—a snapshot of human knowledge, language, and culture. But what happens when their web crawlers, sent out to gather new data for the next generation of models, start ingesting content created by their predecessors?

This leads to a degenerative feedback loop that researchers call "model collapse." Imagine making a photocopy of a photocopy. Each iteration introduces small errors and loses a bit of detail, until the final copy is a blurry, distorted mess. When AI models train on data generated by other AIs, a similar degradation occurs.

The model starts to overfit on the stylistic tics and common errors of AI-generated text. The diversity of its knowledge shrinks, and its connection to the real world—the ground truth of human experience—weakens. A 2023 study from researchers at the Universities of Cambridge and Edinburgh warned that training on synthetic data causes models to "forget" the true underlying data distribution, eventually producing content that is increasingly uniform and detached from reality.
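
The dynamic is easy to demonstrate in miniature. The following toy simulation is my own illustration, not the cited study's experimental setup: it stands in for "training on your own output" by repeatedly fitting a Gaussian to samples drawn from the previous generation's fit.

```python
import numpy as np

# Toy model collapse: each "generation" fits a Gaussian to data sampled
# from the previous generation's fit, instead of from the ground truth.
# A deliberately simplified illustration, not the cited paper's method.

rng = np.random.default_rng(42)
N = 200                               # training samples per generation

data = rng.normal(0.0, 1.0, N)        # generation 0 trains on real (human) data
for gen in range(1, 16):
    mu_hat, sigma_hat = data.mean(), data.std()  # "train" this generation
    data = rng.normal(mu_hat, sigma_hat, N)      # next gen sees only its output
    print(f"gen {gen:2d}: mean={mu_hat:+.3f}  std={sigma_hat:.3f}")

# Typical run: the mean wanders away from 0 and the std tends to decay
# from 1.0, because each refit discards a little of the true distribution's
# tails -- the statistical analogue of photocopying a photocopy.
```

Real LLMs are vastly more complex, but the paper cited above describes the same qualitative pattern: the tails of the distribution disappear first, then the output grows progressively more uniform.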

This isn't a theoretical problem. As more of the web becomes AI slop, the pool of high-quality human data shrinks. AI developers will have to work much harder to curate clean training sets, a task that runs counter to the scale-at-all-costs approach that has dominated the field. Without careful curation, we risk creating generations of AI trained on the hollow echo of their own output—an information ecosystem that, like the ouroboros, eats its own tail.

The Spam Analogy: A Glimpse of the Future?

This situation may feel unprecedented, but we've faced a similar challenge before: email spam. In the late 1990s and early 2000s, inboxes were flooded with unsolicited, low-quality messages. For a time, it seemed like spam might render email unusable. But we adapted.

First, we developed technological filters. Gmail's priority inbox and sophisticated spam detectors became incredibly effective at isolating the junk from the messages we actually wanted to see. Second, and more importantly, we developed cognitive filters. We learned to recognize the tell-tale signs of spam—the suspicious links, the urgent tone, the generic greetings—and ignore them. We began to rely on a smaller circle of trusted senders.

A similar adaptation is likely to occur with web content. As search engine results become noisier and less reliable, users will naturally gravitate toward sources they know and trust:

  • Search engines will have to evolve, moving beyond simple keyword matching to prioritize signals of authority, expertise, and trustworthiness—what Google refers to as E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness).
  • Users will become more discerning, relying on trusted brands, specific journalists, academic institutions, and expert communities rather than wading through the generic results of a broad search.
  • New tools may emerge, like browser plugins that flag AI-generated content (a toy sketch of such a heuristic follows below) or curated search engines that only index human-vetted sources.
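
For a flavor of what the simplest such tools might look like, here is a naive heuristic flagger. The phrase list is my own assumption about common stylistic tics; real detectors rely on much stronger statistical and model-based signals:

```python
import re

# Naive slop heuristic: count tell-tale stock phrases per 100 words.
# The phrase list is an illustrative assumption, not a vetted detector;
# production tools use statistical and model-based signals instead.

SLOP_TELLS = [
    r"\bin today's fast-paced world\b",
    r"\bdelve into\b",
    r"\bit is important to note that\b",
    r"\bunlock the (?:secrets|power) of\b",
    r"\bgame[- ]changer\b",
]

def slop_score(text: str) -> float:
    """Tell-tale phrases per 100 words; higher = more slop-like."""
    words = max(len(text.split()), 1)
    hits = sum(len(re.findall(p, text, re.IGNORECASE)) for p in SLOP_TELLS)
    return 100.0 * hits / words

sample = ("In today's fast-paced world, let's delve into the secrets "
          "of lasagna. It is important to note that nonna approved.")
print(f"slop score: {slop_score(sample):.1f}")
```

A real plugin would need far more than keyword matching, but the shape of the problem is the same: scoring text against signals of synthetic style rather than trying to prove authorship outright.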

The vast ocean of AI slop won't disappear, but it may become, like the spam folder, a part of the internet's background noise—a place most people rarely visit and have learned to ignore.

The Flight to Quality and the Enduring Value of the Human Touch

If the future of the web is a retreat into trusted enclaves, it highlights the enduring value of human curation. In a world of infinite, cheap content, the things that become valuable are effort, expertise, and accountability.

When a journalist puts their byline on an article, or a scientist publishes a paper under their name, they are staking their reputation on the accuracy and integrity of that work. This act of standing behind one's words is a powerful signal of quality that an anonymous, AI-generated article can never replicate.

This doesn't mean all AI-generated content is bad. AI can be a powerful tool for summarizing information, generating ideas, and assisting human creators. The distinction is one of intent and oversight. The problem with AI slop is that it is created without purpose beyond its own propagation, lacking a human author who has thought deeply about the topic and is willing to endorse the final product.

As we navigate this new era, we may find ourselves valuing the human touch more than ever. The messy, inefficient, and deeply personal process of human creation becomes a feature, not a bug—a guarantee that someone cared enough to do the work.

Why It Matters

The proliferation of AI slop is more than just a technical nuisance. It is a challenge to the very idea of a shared, reliable source of public knowledge. By flooding our digital spaces with low-grade, synthetic information, it erodes trust, hampers discovery, and threatens to poison the data wells for the very AI systems that created it.

The solution will not be simple. It will require a multi-pronged effort from search engines to better identify quality, from AI developers to pursue more responsible data sourcing, and from all of us, as consumers of information, to become more critical and deliberate in where we place our trust. The age of AI slop is here, and navigating it successfully means learning to look past the noise to find the signal of genuine human insight.

I take on a small number of AI insights projects (think product or market research) each quarter. If you are working on something meaningful, let's talk. Subscribe or comment if you found this valuable.

References
  • Is the internet already ‘mostly fake’? - The Verge (news, 2024-05-31) https://www.theverge.com/2024/5/31/24167608/internet-fake-content-ai-dead-internet-theory -> Discusses the difficulty of quantifying fake/AI content but confirms the trend and the widespread feeling that the internet's quality is declining due to it. It contextualizes the '50%' claim as part of a broader 'dead internet theory'.
  • The People Making AI-Generated Content Spam the Internet - VICE (news, 2023-05-10) https://www.vice.com/en/article/v7b7a9/the-people-making-ai-generated-content-spam-the-internet -> Provides case studies and interviews with people creating AI content farms for profit, confirming the economic incentives described in the article.
  • The AI-Powered Disinformation Landscape - Council on Foreign Relations (org, 2024-09-10) https://www.cfr.org/in-brief/ai-powered-disinformation-landscape -> Details how generative AI is being used by state and non-state actors to create and scale disinformation campaigns, supporting the claim about political motivations.
  • Tracking AI-enabled Misinformation - NewsGuard (dataset, 2025-11-20) https://www.newsguardtech.com/special-reports/ai-tracking-center/ -> Provides a regularly updated database of AI-generated news and information sites that operate with little to no human oversight, offering a concrete, verifiable number to support the scale of the problem.
  • On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? - FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (journal, 2021-03-01) https://dl.acm.org/doi/10.1145/3442188.3445922 -> A foundational paper that discusses the data sourcing for large language models, their environmental and social costs, and their tendency to parrot biases from their vast training data scraped from the internet.
  • The Curse of Recursion: Training on Generated Data Makes Models Forget - arXiv (Cornell University) (whitepaper, 2023-05-29) https://arxiv.org/abs/2305.17493 -> This is a key academic paper that formally describes and analyzes the phenomenon of 'model collapse' (termed 'The Curse of Recursion' here), providing the primary technical evidence for this section of the article.
  • The open internet is dying. The battle to save it is happening in the fediverse. - MIT Technology Review (news, 2024-02-26) https://www.technologyreview.com/2024/02/26/1089019/the-open-internet-is-dying-fediverse-threads-meta-mastodon/ -> Argues that users are moving away from the 'open' but chaotic web towards more curated, community-governed spaces, which supports the 'flight to quality' thesis.
  • Creating helpful, reliable, people-first content - Google Search Central (documentation, 2024-09-12) https://developers.google.com/search/docs/fundamentals/creating-helpful-content -> Official documentation from Google explaining its E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) guidelines, which confirms that search engines are actively trying to prioritize signals of human expertise over low-quality content.
  • The Challenge of Synthetic Data in the Era of Generative AI - QualZ.ai (news, 2024-01-18) https://qualz.ai/the-challenge-of-synthetic-data-in-the-era-of-generative-ai/ -> This article discusses the trade-offs of using synthetic data for AI training, reinforcing the concepts of model collapse and the value of high-quality, human-generated data for maintaining model accuracy.
  • The Wonderful World of AI Slop - YouTube (video, 2024-10-27) -> The original source video that provided the foundational concepts and arguments for this article, including the definition of 'AI slop,' the recipe blog example, and the email spam analogy.

Appendices

Glossary

  • AI Slop: A colloquial term for the high volume of low-quality, often nonsensical or inaccurate, content generated by AI models and published on the internet, typically for economic or political purposes.
  • Model Collapse: A phenomenon where generative AI models, trained recursively on their own synthetic output, begin to degrade in quality. They lose diversity, amplify biases, and become progressively detached from the original ground truth of human-generated data.
  • E-E-A-T: Stands for Experience, Expertise, Authoritativeness, and Trustworthiness. A set of guidelines used by Google to assess the quality of web content, prioritizing pages that demonstrate real-world, expert knowledge and can be trusted by users.

Contrarian Views

  • AI-generated content can increase accessibility, allowing non-writers to share information and ideas more easily.
  • Over time, market forces will naturally filter out the worst 'slop,' and AI models will improve at generating higher-quality, factual content, making the problem temporary.
  • The 'flight to quality' could lead to a more fragmented and elitist internet, where high-quality information is locked behind paywalls or within exclusive communities, widening the information gap.

Limitations

  • The exact percentage of the web that is 'AI slop' is currently impossible to measure accurately and is a subject of ongoing debate.
  • The long-term effects of 'model collapse' are still being researched and may be mitigated by new training techniques or better data curation.
  • This article focuses primarily on text-based content, but similar issues of 'slop' are emerging in AI-generated images, video, and audio.

Further Reading

  • The Curse of Recursion: Training on Generated Data Makes Models Forget (Preprint) - https://arxiv.org/abs/2305.17493
  • NewsGuard: AI Tracking Center - https://www.newsguardtech.com/special-reports/ai-tracking-center/
  • How AI-generated content is ruining the search experience - https://searchengineland.com/how-ai-generated-content-is-ruining-the-search-experience-437379

Recommended Resources

  • Signal and Intent: A publication that decodes the timeless human intent behind today's technological signal.
  • Blue Lens Research: AI-powered patient research platform for healthcare, ensuring compliance and deep, actionable insights.
  • Outcomes Atlas: Your Atlas to Outcomes — mapping impact and gathering beneficiary feedback for nonprofits to scale without adding staff.
  • Lean Signal: Customer insights at startup speed — validating product-market fit with rapid, AI-powered qualitative research.
  • Qualz.ai: Transforming qualitative research with an AI co-pilot designed to streamline data collection and analysis.
