Re: [AMIA-L] Speech To Text Technology
Captioning is done manually with a real person transcribing.
The radio transcript project I mentioned wants to be automated, and the
challenge is not just different voices, but also different audio levels
and sound qualities (remote broadcasts, ambient background noises,
telephone hook-ups, etc.)
So the challenge of 'training' a system is even greater.
Nan Rubin
-----Original Message-----
From: Association of Moving Image Archivists [mailto:AMIA-L@xxxxxxxxxxx]
On Behalf Of Albert Steg
Sent: Friday, November 30, 2007 12:23 PM
To: AMIA-L@xxxxxxxxxxx
Subject: Re: [AMIA-L] Speech To Text Technology
Rick, which software product was that? Might it have been Naturally
Speaking?
http://www.nuance.com/naturallyspeaking/business/
It's not surprising that voice recognition will be a much tougher nut
to crack than text recognition -- while it's tough enough training a
computer to see that 'a' and 'A' are the same letter, the mind
boggles at what it would take for an automated system to transcribe a
conversation between 2 or 3 people with differing accents and voice
timbres -- without ever having been able to benchmark them against a
known text.
Do captioned broadcasts actually pull this off? I always figured
they had a live person doing a sort of stenography job.
Albert
On Nov 30, 2007, at 11:47 AM, Rick Prelinger wrote:
> I tried it. The off-the-shelf software is said to work well when
> it gets input from a single speaker who has spent the time going
> through the routine to train the program to recognize his or her
> voice. This involves reading a specific script into a mike for
> 15-20 minutes. My trials showed that I could train it to recognize
> my own voice and thereby automate notetaking or at best the
> production of a first-draft text, but even a sloppy typist like me
> can produce cleaner copy. When I presented it with different
> voices, it produced gibberish. I had hoped to use it to convert
> voices heard on the radio to text, but that was beyond the scope of
> what you can source in the non-classified world.
>
> Rick
>
>
> --
>
> Rick Prelinger
> Prelinger Archives http://www.prelinger.com
> P.O. Box 590622, San Francisco, Calif. 94159-0622 USA
> footage@xxxxxxxxx