Anyone can search the TV News Archive

Posted: Wed Jul 09, 2025 4:10 am
by nurnobi40
That’s the research question that the San Francisco-based firm Joostware concentrated on for its Who Said What project, which won a $50,000 prototype grant from the John S. and James L. Knight Foundation. Last week Joostware’s founder, Delip Rao, presented the project’s progress at a gathering in Austin, Texas. (The Internet Archive’s own Dan Schultz, in his Bad Idea Factory incarnation, also presented on Contextubot, which we recently profiled here.)

“Audio and video today is viewed as an opaque object and it’s meant for linear consumption,” Rao said in his presentation. “But truly any audio and video especially in the context of news has a lot of structure to it. There are speakers of interest, and these speakers take turns, and then within each turn something was communicated. So our goal is to identify these speakers who are of interest and also the content that was spoken in that turn and indexing that.”

Anyone can already search the TV News Archive via closed captions at the Internet Archive or via Television Explorer. Our experiments with facial detection and chyron extraction offer another way to find and analyze news clips. But searching a video archive by “speaker id” – finding all the video where a person is actually talking – is a tough technical challenge. Our Trump Archive and congressional, executive branch, and administration archives are all manually curated video collections designed to demonstrate what it would be like to have automated speaker id search.

Joostware researchers have made progress toward this goal. They took material from the Trump Archive and used it to train a model that recognizes the president’s voice from properties of the voice signal. They created prototype search software that is more than 95% accurate, on a human-annotated dataset, in returning video clips where Trump is actually speaking.

What’s next? With more resources, Joostware hopes to give this technology back to the Internet Archive to improve search within the TV News Archive. Rao and others also continue to work within the larger community of researchers trying to crack the code of video to help fact-checkers and journalists hold power accountable.