Re: NSSpeechRecognizer and Speech Recognition calibration

Ricky Sharp Sat, 27 Dec 2008 05:48:17 -0800


On Dec 26, 2008, at 4:56 AM, Christopher Corbell wrote:

I'm working on an accessibility app for the visually impaired and was
hoping to use NSSpeechRecognizer.

I've found it extremely difficult to get NSSpeechRecognizer to behave
predictably on my system. Does anyone on the list have experience with
this class & success with the Speech Recognition system preference
panel? Any tips or tricks?

I find that the calibration dialog for the Speech Recognition settings
doesn't work at all for me. I'm using a pretty standard external
microphone (built into a Logitech Webcam) with an Intel Mac Mini. I can
see my signal just fine and I'm speaking clearly in as accent-neutral a
way as I can, and still none of the test sentences ever highlights. Is a
headset mic typically required, or is there some other gotcha here?

It must be your particular setup. I've been doing SR ever since it
debuted (Mac OS 8.x days) and have not had trouble when words/phrases
are unique enough (as yours clearly are).

When I give NSSpeechRecognizer a very small and unambiguous command
set, I find it badly misses the mark. For example I might have "Play",
"Next", and "Stop" in my command set, and it will interpret "Next" as
"Play", but it will never interpret "Play" as a command - pretty
unusable, I'm hoping it's just a calibration issue.

Since the calibration dialog isn't working for you, it's not
surprising that it's getting your phrases confused. Make sure to get
your setup working in the calibration area first.

One last note - is there any way to do proper dictation with this class
or will it only recognize the preset command list you give it? I'm
thinking for example of prompting for a file name to save to, or a term
to search on - it would be nice to have true dictation, otherwise I'll
resort to providing an alphabet as a command set so the user can spell
it out (assuming I can get that to work).

No. And you definitely do _not_ want to add letters to your language
model. English has too many letters whose names sound extremely
similar: 'B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z' are probably the
largest such set.

When I worked on numeric input, I had to offer two modes (two
different speech models, selected by a user preference). For example,
'sixteen' and 'sixty' were often confused. This got better over time,
but still was not 100%. Users who had trouble could switch to the
other model, in which they spoke individual digits instead: 'one six'
and 'six zero'. Now the phrases were unique enough to remove any
confusion.
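
To make the digit-by-digit mode concrete, here is a minimal sketch of
how an app might turn a run of recognized digit words back into a
number. It is not the code I shipped; it is written in Python only to
keep it short, the names DIGIT_WORDS and digits_to_number are made up
for illustration, and it assumes the recognizer hands you one
recognized word at a time which you collect into a list.

```python
# Hypothetical vocabulary for the "individual digits" speech model.
# These ten words would also be the command list given to the recognizer.
DIGIT_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}


def digits_to_number(recognized_words):
    """Combine recognized digit words, in spoken order, into an integer.

    Raises ValueError if the list is empty or contains a word that is
    not in DIGIT_WORDS.
    """
    if not recognized_words:
        raise ValueError("no digits were recognized")
    value = 0
    for word in recognized_words:
        key = word.lower()
        if key not in DIGIT_WORDS:
            raise ValueError("not a digit word: %r" % word)
        # Shift what we have so far one decimal place and append the digit.
        value = value * 10 + DIGIT_WORDS[key]
    return value


sixteen = digits_to_number(["one", "six"])
sixty = digits_to_number(["six", "zero"])
```

The point of the design is that the vocabulary holds only ten words,
none of which sounds much like another, so 'one six' and 'six zero'
cannot be mistaken for each other the way 'sixteen' and 'sixty' can.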

You really only have two options: (1) the user has a 3rd-party
dictation solution, or (2) your solution uses words/phrases for letter
input. For example, the military alphabet (alpha, bravo, charlie,
etc.), which was designed to work over very low-quality audio.
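
As a rough sketch of option (2), the app could keep a table from
spelling words to letters and rebuild the text from whatever sequence
of words the recognizer reports. This is only an illustration in
Python: PHONETIC_WORDS, WORD_TO_LETTER and spell are hypothetical
names, the word spellings are the common ones (pick whichever spellings
your recognizer handles best), and it assumes each recognized command
arrives as a separate string.

```python
# Hypothetical command vocabulary: one distinct word per letter.
PHONETIC_WORDS = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliet", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "x-ray", "yankee", "zulu",
]

# Each word stands for its own first letter.
WORD_TO_LETTER = {word: word[0] for word in PHONETIC_WORDS}


def spell(recognized_words):
    """Turn recognized spelling words, in spoken order, into a string.

    Raises ValueError for any word outside the vocabulary.
    """
    letters = []
    for word in recognized_words:
        key = word.lower()
        if key not in WORD_TO_LETTER:
            raise ValueError("not a spelling word: %r" % word)
        letters.append(WORD_TO_LETTER[key])
    return "".join(letters)


search_term = spell(["charlie", "alpha", "bravo"])
```

The same list doubles as the command set you hand to the recognizer,
so the vocabulary and the lookup table cannot drift apart.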


___________________________________________________________
Ricky A. Sharp         mailto:rsh...@instantinteractive.com
Instant Interactive(tm)   http://www.instantinteractive.com



_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
