Thanks for the feedback, Eric. Is it really this hopeless? You talked about
the Sphinx projects being okay, but not ready for normal users. To what
extent are they capable? I'd really love to know if you or anyone else has
tried them.

I have looked into them, but haven't had the time (not being a very
technically capable user) to get them going, or to get them going nicely.
If I knew how well they worked, I'd probably be more inclined to spend the
time I don't have getting them working.

Chris Hayes


On 19/02/07, Eric S. Johansson <[EMAIL PROTECTED]> wrote:

Chris Hayes wrote:
> Hi - I was wondering whether anyone here might know about what voice
> recognition software is currently available for Linux.

(warning, I am an unrepentant curmudgeon and negative filter.  Interpret
the following accordingly.  If I'm wrong on any points, and someone
wants to correct me, I will gladly learn.)

In a nutshell, not much.  With Sphinx-4 and the others in its family, you
have some fairly decent recognition engines.  However, they are not ready
for prime time; if they were, people would be using them for desktop
recognition.  While the recognition engines may work well, a lot of the
ancillary pieces such as training, dealing with microphone switching,
dictionary management, etc. are not quite there yet.  On the other hand,
the same shortcomings can be laid at the feet of the Linux and Windows
audio subsystems.

From my perspective, the only usable speech recognition for end users is
NaturallySpeaking.  There may be something on the Macintosh, but I don't
have any experience there.  The reason I say NaturallySpeaking is the
only usable one is that it's a large-vocabulary continuous speech
recognition system people actually use to get work done.  The recognition
engine, language model, sound system interface, etc. have had many years
to evolve.  Nuance has had a couple of years to screw it up, and they've
done a wonderful job of it.  I think the only positive contribution they
have made during their stewardship of the product is the addition of a
Bluetooth microphone audio model.

The only way to get good speech recognition on Linux is for someone to
drop a few million dollars into Nuance's lap and pray.  Not a good
solution.

I've been thinking about an alternative model for a couple of years, in
between other projects, but I do believe the best solution (best defined
as getting handicapped people working) would be to make use of Windows
and Linux via virtual machines.  Since virtual machines do horrible
things to sound systems, I would recommend using Windows as the host OS
with speech recognition, a mediator to transfer
characters/commands/keystrokes to the Linux environment, and a mediator
to return window state information such as screen content, applications
running, and so on.
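
For the sake of illustration, the smallest possible version of such a
mediator might look something like the sketch below.  This is purely a
sketch, not a working design: the guest address, port number, and the use
of xdotool to inject keystrokes are all placeholder assumptions.

    # Bare-bones mediator sketch.  The Windows host forwards recognized
    # text over a TCP socket; the Linux guest receives it and types it
    # into whatever X11 window has focus via xdotool.
    import socket
    import subprocess
    import sys

    HOST, PORT = "192.168.1.50", 8765   # assumed address of the Linux guest

    def send_text(text):
        """Windows side: ship recognized text to the Linux guest."""
        with socket.create_connection((HOST, PORT)) as conn:
            conn.sendall(text.encode("utf-8"))

    def listen_and_type():
        """Linux side: receive text and inject it as keystrokes."""
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind(("", PORT))
            srv.listen(1)
            while True:
                conn, _ = srv.accept()
                with conn:
                    data = conn.recv(4096).decode("utf-8")
                    if data:
                        subprocess.run(["xdotool", "type", "--", data])

    if __name__ == "__main__":
        if len(sys.argv) > 1 and sys.argv[1] == "listen":
            listen_and_type()                    # run on the Linux guest
        else:
            send_text(" ".join(sys.argv[1:]))    # run on the Windows host

A real mediator would of course need the reverse channel as well (window
titles, screen content, applications running), which is the harder half
of the problem.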

There has been a primitive instance (which has since been taken off the
net) showing the technique is fundamentally sound.  A full-function
mediator, while difficult, is a couple of orders of magnitude or more
easier to build than moving a large and complicated Windows application
to Linux.

In the short term, run Linux in a virtual machine, display apps via an
X11 server, and use something like natpython and one of its macro
packages to build commands for Linux applications.  nattext will still
bite you in the ass with all the random characters it inserts in
applications, but that's Nuance's contribution.
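
For anyone who hasn't seen a natpython (NatLink) macro, the general shape
is roughly the following.  The spoken phrase and the keystrokes it sends
are invented for illustration; the GrammarBase/gramSpec/gotResults layout
is the usual NatLink macro pattern as I remember it.

    # Sketch of a NatLink ("natpython") macro that types a shell command
    # into the focused window, e.g. an xterm shown through the X11 server.
    # The spoken phrase and the keystrokes are made up for illustration.
    import natlink
    from natlinkutils import GrammarBase

    class LinuxCommands(GrammarBase):
        # one spoken command: "list directory"
        gramSpec = """
            <listdir> exported = list directory;
        """

        def initialize(self):
            self.load(self.gramSpec)
            self.activateAll()

        def gotResults_listdir(self, words, fullResults):
            # send the keystrokes to whatever window has focus
            natlink.playString("ls -l{enter}")

    grammar = LinuxCommands()
    grammar.initialize()

    def unload():
        global grammar
        if grammar:
            grammar.unload()
        grammar = None

The macro packages layered on NatLink hide most of this boilerplate; this
is just the raw shape underneath.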

---eric

--
Speech-recognition in use.  It makes mistakes, I correct some.
