Chris,

I have been discussing this with the CMU Sphinx project leads, and we can
build models from audiobooks as well. This approach saves a lot of time and
improves quality, since the narration is accurate and clear. We are
currently defining a way to create Hindi and Brazilian Portuguese models.

Thanks,
Andre

On Oct 30, 2014 5:47 PM, "Chris Hofmann" <chofm...@mozilla.com> wrote:

> On 10/30/14 5:24 PM, smaug wrote:
>
>> On 10/31/2014 02:21 AM, smaug wrote:
>>
>>> Intent to ship is too strong for this.
>>> We first need to have the implementation landed and tested ;)
>>>
>>> I wouldn't ship the implementation in desktop FF without plenty more
>>> testing.
>>>
>>>
>> But I guess the question is what people think about shipping
>> pocketsphinx + API, even if disabled by default.
>>
>> Andre, we need some numbers here. How much does Pocketsphinx increase
>> binary size, or download size?
>> When the pref is enabled, how much memory does it use on desktop, and
>> what about on b2g?
>>
>>
> This is important work and the competition is ramping up quickly after many
> years of promises about this year being the year of voice recognition. We
> will probably fall behind quickly if we don't get something going here in
> the next year.
>
> Can you also talk a bit about what the plan and set of challenges look
> like for expanding the supported languages, and how these would impact the
> numbers Olli has asked for?
>
> The place we really need this is b2g, but phones are only shipping in
> international markets right now, so English only is not all that helpful.
>
> -chofmann
>
>
>>>
>>> -Olli
>>>
>>>
>>> On 10/31/2014 01:18 AM, Andre Natal wrote:
>>>
>>>> I've been researching speech recognition in Firefox for two years. First
>>>> SpeechRTC, then emscripten, and now Web Speech API with CMU pocketsphinx
>>>> [1] embedded in the Gecko C++ layer, a project that I had the luck to
>>>> develop for Google Summer of Code with the mentoring of Olli Pettay,
>>>> Guilherme Gonçalves, Steven Lee, Randell Jesup plus others, and with
>>>> the management of Sandip Kamat.
>>>>
>>>> The implementation already works in B2G, Fennec and all FF desktop
>>>> versions, and the first language supported will be English. The API and
>>>> implementation conform to the W3C standard [2]. The preference to
>>>> enable it is: media.webspeech.service.default = pocketsphinx
>>>>
>>>> The patches required to achieve this are:
>>>>
>>>> - Import pocketsphinx sources in Gecko. Bug 1051146 [3]
>>>> - Embed English models. Bug 1065911 [4]
>>>> - Change SpeechGrammarList to store grammars inside SpeechGrammar
>>>>   objects. Bug 1088336 [5]
>>>> - Creation of a SpeechRecognitionService for Pocketsphinx. Bug
>>>>   1051148 [6]
>>>>
>>>>
>>>> Also, other important features that we don't have patches for yet:
>>>> - Relax the VAD strategy to be less strict and avoid stopping in the
>>>>   middle of speech when speaking low-volume phonemes [7]
>>>> - Integrate or develop a grapheme-to-phoneme algorithm for real-time
>>>>   generation when compiling grammars [8]
>>>> - Include and build models for other languages [9]
>>>> - Continuous and word-spotting recognition [10]
>>>>
>>>> The WIP repo is here [11], and this Air Mozilla video [12] plus this
>>>> wiki [13] have more detailed info.
>>>>
>>>> In this comment you can see a CPU usage flame graph captured while
>>>> recognition is happening [14].
>>>>
>>>> I would like to hear your comments.
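For readers who want to see what page code against this API might look like, here is a minimal sketch in TypeScript. It uses only names from the W3C Web Speech API draft (SpeechRecognition, SpeechGrammarList.addFromString, grammars, lang, onresult, start). The helper names buildJsgf and startRecognition, the small structural interfaces, and the choice of a JSGF grammar string are assumptions made for illustration; the thread does not say which grammar format the Pocketsphinx service accepts.

```typescript
// Minimal structural types for the parts of the Web Speech API used below.
// They are illustrative stand-ins, not the browser's own type definitions.
interface GrammarListLike {
  addFromString(grammar: string, weight?: number): void;
}
interface RecognitionLike {
  grammars: GrammarListLike;
  lang: string;
  onresult: ((event: any) => void) | null;
  start(): void;
}
interface SpeechScope {
  SpeechRecognition?: new () => RecognitionLike;
  SpeechGrammarList?: new () => GrammarListLike;
}

// Hypothetical helper: build a one-rule JSGF grammar from a word list.
// Assumes the recognition service understands JSGF.
function buildJsgf(name: string, words: string[]): string {
  return `#JSGF V1.0; grammar ${name}; public <${name}> = ${words.join(" | ")} ;`;
}

// Hypothetical helper: start a single recognition restricted to `words`.
// Returns null when the scope does not expose the API, for example when
// the feature is disabled in the running build.
function startRecognition(
  scope: SpeechScope,
  words: string[],
  onText: (text: string) => void
): RecognitionLike | null {
  const Recognition = scope.SpeechRecognition;
  const GrammarList = scope.SpeechGrammarList;
  if (!Recognition || !GrammarList) return null;

  const recognition = new Recognition();
  const grammars = new GrammarList();
  grammars.addFromString(buildJsgf("commands", words), 1);
  recognition.grammars = grammars;
  recognition.lang = "en-US";
  // Report the transcript of the first alternative of the first result.
  recognition.onresult = (event) => onText(event.results[0][0].transcript);
  recognition.start();
  return recognition;
}
```

In a browser this could be called as startRecognition(window as unknown as SpeechScope, ["red", "green"], console.log). Whether a page actually sees these constructors depends on the build and on preferences such as the media.webspeech.service.default value named above.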
>>>>
>>>> Thanks,
>>>>
>>>> Andre Natal
>>>>
>>>> [1] http://cmusphinx.sourceforge.net/
>>>> [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>>>> [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
>>>> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
>>>> [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
>>>> [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
>>>> [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
>>>> [8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
>>>> [9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
>>>>     https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
>>>> [10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
>>>> [11] https://github.com/andrenatal/gecko-dev
>>>> [12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/
>>>>      (jump to 12:00)
>>>> [13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
>>>> [14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14
>>>>
>>>>
>>>
>> _______________________________________________
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform