New Global Benchmark Sets Standard for Truly Effective AI Detection in the Age of Deepfakes

WITNESS launches TRIED Benchmark to promote the development of AI detection tools that support the preservation of truth and authenticity in the era of evolving synthetic media.

WITNESS today announces the release of its new report, TRIED: Truly Innovative and Effective AI Detection Benchmark, a groundbreaking framework to evaluate and improve AI detection tools amid the rising threat of generative AI and deceptive synthetic media. Drawing on case submissions to WITNESS’ Deepfake Rapid Response Force (DRRF), the expertise of its AI detection specialists, and global consultations with journalists, fact-checkers, and leading technologists, WITNESS aims to shape the AI detection space for effectiveness and innovation.

“We’re in the middle of great leaps in the realism of AI generation, with tools like Google’s Veo. However, detection tools for deceptive AI have not kept pace and are not in the hands of those who need them most. We must ensure that tools are built that meet the needs of those on the frontlines of truth globally,” said Sam Gregory, Executive Director at WITNESS. “Low-quality or faulty detection that doesn’t work globally, and that is not easily explainable to the public, makes it easy to deny the real and to share AI falsehoods.”

From Ukraine to Ghana, India to Mexico, the report draws on frontline experience, expert analysis, and over 40 real-world cases to reveal how current AI detection tools often fall short in real-life, high-stakes contexts, particularly in the Global Majority. Strikingly, 36% of cases analyzed by the DRRF failed to yield reliable results due to factors like poor media quality or unrepresentative training data.

The TRIED Benchmark outlines six core pillars for evaluating detection tools:

  • Performance in real-world conditions
  • Transparency and explainability
  • Targeted accessibility
  • Fairness and representation
  • Durability and resilience
  • Integration with broader verification workflows

It offers a practical checklist for developers, policymakers, and regulators to assess and strengthen AI detection capabilities—judged not just by accuracy, but by usability, equity, and alignment with the public interest.

The report also includes recommendations for governments, standards bodies, and AI developers to:

  • Incorporate sociotechnical considerations into future AI evaluations and practices.
  • Invest in public-interest detection tools.
  • Set minimum standards for fairness, explainability and durability benchmarks.
  • Integrate TRIED into global regulatory frameworks.

As generative AI rapidly evolves, the TRIED Benchmark helps ensure that detection technologies serve, rather than undermine, global information integrity and human rights.

Read the Tech Policy Press op-ed by shirin anlen, one of the TRIED report’s authors, here.

Download the full report on arXiv here.
