Speech recognition software used for records of Holocaust survivors



Wired News
Software Scours Holocaust Records
http://www.wired.com/news/print/0,1294,47584,00.html
By Kendra Mayfield

2:00 a.m. Oct. 22, 2001 PDT

The voices are muddled with thick, emotional accents, revealing both tragic
and heroic eyewitness accounts from thousands of Holocaust survivors and
witnesses.

But while these videotaped tracks are imperfect, they are invaluable to
historians and generations to come.

Researchers from Johns Hopkins University, IBM and the University of
Maryland are developing speech recognition software to allow historians and
scholars to search through more than 51,000 video interviews from Holocaust
survivors, witnesses and liberators.

The interviews will be drawn from the Survivors of the Shoah Visual
History Foundation (www.vhf.org), which holds the world's largest coherent
archive of videotaped oral histories, with 116,000 hours of digitized interviews
in 32 languages from 57 countries.

The National Science Foundation recently awarded $7.5 million, distributed
over five years, to help fund the project
(http://www.clsp.jhu.edu/research/malach/) and develop a new system that's
capable of recognizing key words and phrases in new languages.

"This is one of the few projects tackling so many things at once on such a
scale," said Bill Byrne, associate research professor for the Center for
Language and Speech Processing at Johns Hopkins. "This is a real application
we're trying to solve."

"Our original mission to collect 50,000 testimonies is now complete," said
Doug Greenberg, president and CEO of the Shoah Foundation. "Our mission now
is to use the archive in educational settings to overcome prejudice and
bigotry."

The Sept. 11 attacks make the Shoah Foundation's mission to unlock its
archive even more essential, Greenberg said.

"Sept. 11 was about a lot of things but it was also about hatred,"
Greenberg said. "We have 50,000 educators whose testimony really speaks to
the evil that hatred in the world makes."

Researchers have already begun manually reviewing English language tapes
and indexing them according to times, places and incidents described in
each interview.

But the time it takes to manually index, summarize, research and review a
collection this size is daunting: A single testimony takes an average of
35 hours. So far, it has cost the Shoah Foundation approximately
$8 million to catalog just 4,000 of these interviews.
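
To put those figures in perspective, here is a rough back-of-the-envelope
calculation. It is only a sketch: it assumes the cost and time per interview
stay constant across the whole collection (a simple linear extrapolation that
the article does not state), and it uses 51,000 as the collection size even
though the article says "more than 51,000."

```python
# Figures quoted in the article.
cataloged_interviews = 4_000      # interviews cataloged so far
cost_so_far_usd = 8_000_000       # approximate cost of cataloging them
hours_per_testimony = 35          # average manual effort per testimony
total_interviews = 51_000         # "more than 51,000" video interviews

# Assumption: cost and effort scale linearly with the number of interviews.
cost_per_interview = cost_so_far_usd / cataloged_interviews
projected_total_cost = cost_per_interview * total_interviews
projected_total_hours = hours_per_testimony * total_interviews

print(f"Cost per interview: ${cost_per_interview:,.0f}")
print(f"Projected cost for the full archive: ${projected_total_cost:,.0f}")
print(f"Projected manual effort: {projected_total_hours:,} hours")
```

Under these assumptions the manual approach works out to about $2,000 per
interview and roughly $100 million for the full archive, which appears
consistent with the savings figure Oard cites.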

Armed with the NSF grant, researchers hope to advance speech recognition
technology and create an "audio search engine" that will lower costs and
speed the cataloging process.

The research team is initially building speech recognition systems to
process interviews conducted in Czech. They will then explore opportunities
to develop systems in other Central European languages.

One of the trickiest challenges is getting a computer to understand
different dialects and languages, said Sam Gustman, executive director of
technology for the Shoah Foundation.

Fully automatic technology for accessing archives is currently inadequate,
researchers say.

Most commercial speech recognition systems are designed for broadcast news.
The technology is most reliable when it is asked to recognize a limited
number of words or phrases that are spoken slowly or clearly.

But the Shoah collection poses unique challenges. Many of the interviews
are conducted in Central and Eastern European languages that don't have
speech recognition systems. Often a speaker will switch between different
languages, alternating from English to Yiddish in mid-sentence.

"As you get into other languages, the technology isn't there yet," Byrne said.

"We don't have the technology to do large-scale search yet (on recorded
conversations)," said Douglas Oard, an assistant professor at the
University of Maryland. "If we had the capability to search, it would
change the way we do things."

Many of the interviews are difficult to understand because speakers have
heavy accents or are highly emotional when they recount their experiences.

"These people are speaking about things that had great impact in their
lives," Oard said. "This makes speech recognition that's designed for
broadcast news fail."

Gustman agreed: "We're not able to process this stuff in English, let alone
other languages."

Rather than transcribing interviews word-for-word, the software will
identify key search terms and phrases.
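
The idea of searching by key terms rather than full transcripts can be
pictured with a small sketch. It assumes a hypothetical recognizer has already
produced (term, interview ID, time offset) hits; the function names, the
interview IDs and the sample data are invented for illustration and do not
describe the project's actual software.

```python
from collections import defaultdict

def build_index(hits):
    """Map each lowercased term to a sorted list of (interview_id, seconds)."""
    index = defaultdict(list)
    for term, interview_id, seconds in hits:
        index[term.lower()].append((interview_id, seconds))
    for positions in index.values():
        positions.sort()
    return index

def search(index, term):
    """Return every (interview_id, seconds) position where the term was spotted."""
    return index.get(term.lower(), [])

# Hypothetical recognizer output, not real archive data.
hits = [
    ("Prague", "interview-002", 310.0),
    ("liberation", "interview-001", 1825.5),
    ("prague", "interview-001", 95.0),
]

index = build_index(hits)
print(search(index, "Prague"))
```

A lookup returns the points in each tape where a term was heard, so a
researcher could jump straight to the relevant passage instead of watching
the whole interview.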

"We don't edit any of these interviews," Greenberg said. "It's completely
raw footage taken directly from interviews with survivors. It will be
broadly accessible, but it won't be edited."

"It isn't as good as a human cataloging, but it's $100 million cheaper,"
Oard said. "We're going to drive costs of doing this down to a point where
applications are possible that are now infeasible."

The potential applications for a Web search engine based on conversational
speech are endless. The technology could be used to scour oral histories
for projects that might otherwise be financially infeasible, from civil
rights to the space program.

"There's a lot more oral history than anybody even knows about," Oard said.

The technology could eventually be applied to other uses of conversational
speech, such as speech-enabled cell phones.

"When you develop this type of technology, you open a lot of doors," Oard
said.


Copyright © 1994-2001 Wired Digital Inc. All rights reserved.





