One has to understand that there are _two_ aspects to voice
recognition:

One is recognizing the speech, i.e. determining what people actually
said. That's what Nuance does and what they are pretty good at (even
Siri and Google are said to use their system).

The other is the AI that determines what the user wants to tell the
system, which is all about semantics, and that's what Siri, Google et
al. are working on.

The speech recognition itself is pretty much solved; I saw impressive
systems (e.g. from Nuance but also Dragon etc.) almost 20 years ago. The
problem for actual voice interaction, though, has always been to
determine what the user wants...
A sub-problem of this is to detect which language a phrase or even
sub-phrase is spoken in.
Here I'm underwhelmed by some systems, especially Siri. I usually use my
phone in English, which solves a good part of the "Music" problem
(except for German music, of course), but it means I can no longer
navigate in Germany because for the life of me I can't figure out how
Siri thinks "Hackescher Markt" should be pronounced in English :) And
that would actually be such an easy problem because hey, Siri knows
where I am, so she should be aware that street names in Berlin are
German...

How good a solution voice recognition (VR) is pretty much depends on the
expectations. I know people who like it a lot and who are willing to
learn hacks to use it:

netchord wrote: 
> 
> for weird pronunciations (AC/DC, Sade, 311) and natural language
> examples (CCR, The Boss, The King, The Stones) one needs specific rules
> hard-coded (AC/DC = ack slash Dee See) into the AVR database.  a lot of
> work went into this several years ago.  additionally, there was a lot of
> time spent optimizing systems for non-native english speakers.  it could
> always be better of course, and similar work would need to be done for
> other languages.
> 
> point is these problems are solvable, so i'm wondering what experiences
> one could enable.

That's the point: some people expect VR to understand what they _mean_
(not even other people will always understand what you _say_). They are
usually quickly disappointed.
Then there are people (like me) who find it deeply awkward to talk to
their hardware aloud in the presence of other people (and I find it
disturbing when others do it), which significantly limits the usefulness
if you are not home alone.
And then there are people who love to have a complete hands-off
interaction and are willing to learn to use the system (and learn what
kind of pronunciation it expects), for them this can already work really
well even today.

I think voice interaction will always stay an interaction model that
only works for part of the population, but I'm pretty sure it can work
really well for those who like it.



---
learn more about iPeng, the iPhone and iPad remote for the Squeezebox
and
Logitech UE Smart Radio as well as iPeng Party, the free Party-App, 
at penguinlovesmusic.com
*New: iPeng 9, the Universal App for iPhone, iPad and Apple Watch*
------------------------------------------------------------------------
pippin's Profile: http://forums.slimdevices.com/member.php?userid=13777
View this thread: http://forums.slimdevices.com/showthread.php?t=105674

_______________________________________________
discuss mailing list
discuss@lists.slimdevices.com
http://lists.slimdevices.com/mailman/listinfo/discuss
