Are Large Language Models The New Disinformation Frontier In 2025?
- Hod Fleishman
- Mar 12
- 4 min read

A network of Russian news websites aims to impact the core data consumed by Large Language Models.
Generative AI and the Large Language Models (LLMs) that serve as its foundation are rapidly becoming the go-to source of information worldwide, on issues big and small. Instead of “ask Google,” it is now “ask ChatGPT.” The companies in this space, be it OpenAI, Google, Meta, or Anthropic, compete over whose latest model is superior. However, these foundational models rely on the accuracy of the data they digest. As always, garbage in, garbage out. Suppose, for example, that millions of articles supporting the theory that the Earth is flat suddenly appeared online. The sheer quantity of content aimed at a single topic would affect these models' answers. In the world of LLMs, quantity signals quality, and here lies an Achilles heel now exploited by bad actors.
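To see why volume alone can tilt an answer, consider a deliberately simplified toy sketch (real LLM training and retrieval pipelines are far more complex): if a system leans on how often a claim appears in its source pool, flooding that pool with copy-pasted articles shifts the "consensus" it reports. The corpus and claims below are invented for illustration only.

```python
from collections import Counter

# Toy illustration (not a real LLM pipeline): a system that reports the
# claim asserted most frequently across its source pool.
def consensus_answer(corpus: list[str]) -> str:
    return Counter(corpus).most_common(1)[0][0]

# A small pool of sources about the Earth's shape, mostly accurate.
sources = ["earth is round"] * 8 + ["earth is flat"] * 2
print(consensus_answer(sources))   # -> "earth is round"

# Flood the pool with duplicated articles pushing the false claim.
flooded = sources + ["earth is flat"] * 50
print(consensus_answer(flooded))   # -> "earth is flat"
```

The point of the sketch is only that repetition at scale, not accuracy, is what moves the needle.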
In recent years, data has emerged pointing to suspicious activity by Russian news sites with a new objective: a new kind of influence campaign. Unlike traditional influence campaigns, where the direct target is people and their opinions, this new breed aims to influence large language models themselves, creating a lasting effect well beyond the reach of even the most successful time-limited campaign. The methodology is simple and effective but requires significant brute force, and the aim is to shape the answers these models provide.
The Pravda Network
In this case, the network responsible for contaminating large language models is an established web of existing and new pseudo-news sites that amplify content supporting the Kremlin's arguments, reasoning, and points of view. These sites target different geographical regions and appear in multiple languages. Interestingly, they do not produce original material; they copy, paste, spread, and amplify existing content, all of it supporting the Kremlin. How extensive is this network, and can it trick large language models into digesting its content and swaying the pendulum of their replies toward Moscow? Here are a few critical data points, provided by NewsGuard and other sources (see the further reading list at the end of the article), about the network, how it works, and the results it produces:
- 150+ different domains aggregate content from Russian sources
- Over 200 new domains joined the network from July 2022 to January 2025
- These domains targeted 49 different countries in 46 different languages
And here is the punchline: According to The American Sunlight Project, this network amplifies 3.6 million Kremlin-supporting articles yearly.
But Does It Work?
NewsGuard, a leader in information reliability, tested chatbots operated by ten of the largest AI providers and found that they repeated false Russian disinformation narratives drawn from content spread by the Pravda network 33.55 percent of the time, provided a non-response 18.22 percent of the time, and debunked the claims 48.22 percent of the time: "In total, 56 out of 450 chatbot-generated responses included direct links to stories spreading false claims published by the Pravda network of websites. Collectively, the chatbots cited 92 different articles from the network containing disinformation, with two models referencing as many as 27 Pravda articles each from domains in the network."
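For a rough sense of scale, the percentages can be translated back into response counts. The per-category counts below are back-of-the-envelope estimates derived from NewsGuard's published figures, not numbers NewsGuard reported directly.

```python
total_responses = 450  # chatbot responses in the NewsGuard audit

# Outcome rates reported by NewsGuard, in percent.
breakdown = {
    "repeated the false narrative": 33.55,
    "gave a non-response": 18.22,
    "debunked the claim": 48.22,
}

for outcome, pct in breakdown.items():
    count = round(total_responses * pct / 100)
    print(f"{outcome}: ~{count} of {total_responses} responses")
```

That works out to roughly 151 responses repeating the false narrative, 82 non-responses, and 217 debunks.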
Implications for Businesses, Governments, and LLM End Users
Since the rise of Generative AI and Large Language Models, every professional workflow has been revisited, from how designers create new logos to how attorneys review contracts and how students conduct research. We use LLMs to co-create, research, study, and support daily and business decisions, so there has to be trust between us and this emerging technology. Studies have shown that LLMs exhibit bias on gender and political topics. However, the Pravda influence campaign is the first exposed attempt to deliberately manipulate the data these models consume in order to skew the answers they provide to our prompts.
There are several lines of defense available, and the first is knowledge. Organizations, be they businesses, NGOs, or government offices, must be made aware of this and similar developments, apply critical thinking, and work both to monitor and detect such manipulations and to provide a counterbalance by making more high-quality information accessible. Individuals must stay alert and question the output of these advanced tools. While many companies now operate in the field of AI governance and safety (for example, HiddenLayer and Lasso), countering deliberate LLM contamination as part of disinformation efforts is a relatively new frontier, one that may require a collaborative approach between researchers and the big tech players. Shifting from large to small LLMs could also help control the data these models consume; a minimal sketch of one monitoring approach follows below.
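As one concrete, deliberately simplified example of the "monitor and detect" defense, an organization could screen the sources a chatbot cites against a maintained list of domains attributed to the network. The domain names and helper function below are hypothetical placeholders, not an actual blocklist or any vendor's API; a real deployment would pull an up-to-date list from researchers such as NewsGuard or The American Sunlight Project.

```python
from urllib.parse import urlparse

# Hypothetical watch list of domains attributed to the network.
SUSPECT_DOMAINS = {
    "example-pravda-mirror.com",
    "example-news-aggregator.net",
}

def flag_suspect_citations(cited_urls: list[str]) -> list[str]:
    """Return the cited URLs whose domain appears on the watch list."""
    flagged = []
    for url in cited_urls:
        domain = urlparse(url).netloc.lower().removeprefix("www.")
        if domain in SUSPECT_DOMAINS:
            flagged.append(url)
    return flagged

# Example: screen the sources returned alongside a chatbot answer.
citations = [
    "https://example-pravda-mirror.com/some-article",
    "https://www.reuters.com/world/some-report",
]
print(flag_suspect_citations(citations))
# -> ['https://example-pravda-mirror.com/some-article']
```

A check like this only catches known domains, which is why it needs to be paired with the broader awareness and counterbalancing measures described above.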
The battle for accurate information goes beyond disagreements over political opinions. It is now being fought at the level of the AI tools we all use daily.
Are you interested in this topic? Do you work on data reliability and LLM integrity, or do you want to explore how this affects the issues you work on and what can be done to build resilience and mitigation? We’d love to hear from you.
Further reading: