Re: Intent to ship: Web Speech API - Speech Recognition with Pocketsphinx

Andre Natal Tue, 18 Nov 2014 23:17:57 -0800

Chris,

I have been talking with the Sphinx project leads, and we can build models from
audiobooks as well.

This approach saves a lot of time and improves quality, since the
narration is accurate and clear.

We are currently working out how to create Hindi and Brazilian Portuguese
models.

Thanks

Andre
On Oct 30, 2014 5:47 PM, "Chris Hofmann" <chofm...@mozilla.com> wrote:

> On 10/30/14 5:24 PM, smaug wrote:
>
>> On 10/31/2014 02:21 AM, smaug wrote:
>>
>>> Intent to ship is too strong for this.
>>> We need to first have implementation landed and tested ;)
>>>
>>> I wouldn't ship the implementation in desktop FF without plenty more
>>> testing.
>>>
>>>
>> But I guess the question is what people think about shipping
>> Pocketsphinx + API, even if disabled by default.
>>
>> Andre, we need some numbers here. How much does Pocketsphinx increase
>> binary size, or download size?
>> When the pref is enabled, how much memory does it use on desktop, and what
>> about on b2g?
>>
>>
> This is important work, and the competition is ramping up quickly after many
> years of promises about this year being the year of voice recognition. We
> will probably fall behind quickly if we don't get something going here in
> the next year.
>
> Can you also talk a bit about what the plan and set of challenges look
> like for expanding the supported languages, and how these would impact the
> numbers Olli has asked for?
>
> The place we really need this is b2g, but phones are only shipping in
> international markets right now, so English-only is not all that helpful.
>
> -chofmann
>
>
>>>
>>> -Olli
>>>
>>>
>>> On 10/31/2014 01:18 AM, Andre Natal wrote:
>>>
>>>> I've been researching speech recognition in Firefox for two years. First
>>>> SpeechRTC, then emscripten, and now the Web Speech API with CMU Pocketsphinx
>>>> [1] embedded in the Gecko C++ layer, a project that I had the luck to develop
>>>> for Google Summer of Code with the mentoring of Olli Pettay, Guilherme
>>>> Gonçalves, Steven Lee, Randell Jesup plus others, and with the
>>>> management of Sandip Kamat.
>>>>
>>>> The implementation already works in B2G, Fennec and all FF desktop
>>>> versions, and the first supported language will be English. The API and
>>>> implementation conform to the W3C standard [2]. The preference to
>>>> enable it is: media.webspeech.service.default = pocketsphinx
>>>>
>>>> The patches required to achieve this are:
>>>>
>>>>   - Import pocketsphinx sources in Gecko. Bug 1051146 [3]
>>>>   - Embed English models. Bug 1065911 [4]
>>>>   - Change SpeechGrammarList to store grammars inside SpeechGrammar
>>>>     objects. Bug 1088336 [5]
>>>>   - Creation of a SpeechRecognitionService for Pocketsphinx. Bug
>>>>     1051148 [6]
>>>>
>>>>
>>>> Also, other important features that we don't have patches for yet:
>>>>   - Relax the VAD strategy to be less strict and avoid stopping in the
>>>>     middle of speech when speaking low-volume phonemes [7]
>>>>   - Integrate or develop a grapheme-to-phoneme algorithm for real-time
>>>>     generation when compiling grammars [8]
>>>>   - Include and build models for other languages [9]
>>>>   - Continuous and word-spotting recognition [10]
>>>>
>>>> The WIP repo is here [11]; this Air Mozilla video [12] and this wiki [13]
>>>> have more detailed info.
>>>>
>>>> In this comment you can see a CPU usage flame graph taken while
>>>> recognition is happening [14].
>>>>
>>>> I'd like to hear your comments.
>>>>
>>>> Thanks,
>>>>
>>>> Andre Natal
>>>>
>>>> [1] http://cmusphinx.sourceforge.net/
>>>> [2] https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>>>> [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1051146
>>>> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1065911
>>>> [5] https://bugzilla.mozilla.org/show_bug.cgi?id=1088336
>>>> [6] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148
>>>> [7] https://bugzilla.mozilla.org/show_bug.cgi?id=1051604
>>>> [8] https://bugzilla.mozilla.org/show_bug.cgi?id=1051554
>>>> [9] https://bugzilla.mozilla.org/show_bug.cgi?id=1065904 and
>>>> https://bugzilla.mozilla.org/show_bug.cgi?id=1051607
>>>> [10] https://bugzilla.mozilla.org/show_bug.cgi?id=967896
>>>> [11] https://github.com/andrenatal/gecko-dev
>>>> [12] https://air.mozilla.org/mozilla-weekly-project-meeting-20141027/
>>>> (jump to 12:00)
>>>> [13] https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
>>>> [14] https://bugzilla.mozilla.org/show_bug.cgi?id=1051148#c14
>>>>
>>>>
>>>
>> _______________________________________________
>> dev-platform mailing list
>> dev-platform@lists.mozilla.org
>> https://lists.mozilla.org/listinfo/dev-platform
>>
