On 30.08.20 at 17:25, MRAB wrote:
On 2020-08-30 07:23, Muskan Sanghai wrote:
On Sunday, August 30, 2020 at 11:46:15 AM UTC+5:30, Chris Angelico wrote:
I recommend looking into CMU Sphinx then. I've used that from Python.
The results are highly entertaining.
ChrisA
Okay I will try it, thank you.

Speech recognition works best when there's a single voice, speaking clearly, with little or no background noise. Movies tend not to be like that.

Which is why the results are "highly entertaining"...


Well, with enough effort it is possible to build a system that is more useful than "entertaining". Google did that: English YouTube videos can be annotated with subtitles generated by speech recognition. For example, try this video:
https://www.youtube.com/watch?v=lYVLpC_8SQE

Go to the settings menu (the little gear icon in the nav bar) and switch on subtitles, "English (auto-generated)". You'll see a word-by-word transcription of the speech, and most of it is accurate.

There are strong arguments that anything one can build with open source tools will be inferior: 1) Google probably has a bunch of highly qualified AI experts working on this. 2) They have an enormous corpus of training data: many videos already have user-provided subtitles, and all of that can be fed into the training.

I'm waiting to be disproven on this point ;)
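
For anyone who wants to take up the challenge: one common way to put a number on "entertaining" versus "useful" is the word error rate (WER), i.e. the word-level edit distance between a hand-made reference transcript and the recognizer's output, divided by the number of reference words. Below is a minimal sketch in plain Python. The function name and the simple lowercase/whitespace tokenization are my own assumptions for illustration, not part of CMU Sphinx or any other tool mentioned in this thread.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase + whitespace split, which ignores punctuation handling.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Run the same clip through two recognizers, compare each output against the same manual transcript, and the lower WER wins. Note that WER can exceed 1.0 when the recognizer inserts many extra words.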

        Christian
--
https://mail.python.org/mailman/listinfo/python-list
