One has to understand that there are _two_ aspects to voice recognition: one is understanding the speech, i.e. determining what people actually said. That's what Nuance does and what they are pretty good at (even Siri and Google are said to use their system).
The other is the AI that determines what the user wants to tell the system, which is all about semantics, and that's what Siri, Google et al. are working on. The speech recognition itself is pretty much solved; I saw impressive systems (e.g. from Nuance, but also Dragon etc.) almost 20 years ago. The hard problem for actual voice interaction, though, has always been determining what the user wants...

A sub-problem of this is detecting which language a phrase, or even a sub-phrase, is spoken in. Here I'm underwhelmed by some systems, especially Siri. I usually use my phone in English, which solves the "Music" problem for the most part (except for German music, of course), but it means I can no longer navigate in Germany, because for the life of me I can't figure out how Siri thinks "Hackescher Markt" should be pronounced in English :) And that would actually be such an easy problem, because hey, Siri knows where I am, so she should be aware that street names in Berlin are German...

How good a solution voice recognition is depends largely on expectations. I know people who like it a lot and who are willing to learn hacks to use it:

netchord wrote:
> for weird pronunciations (AC/DC, Sade, 311) and natural-language
> examples (CCR, The Boss, The King, The Stones) one needs specific rules
> hard-coded (AC/DC = ack slash Dee See) into the AVR database. a lot of
> work went into this several years ago. additionally, there was a lot of
> time spent optimizing systems for non-native english speakers. it could
> always be better of course, and similar work would need to be done for
> other languages.
>
> point is these problems are solvable, so i'm wondering what experiences
> one could enable.

That's the point: some people think VR will understand what they _mean_ (not even other people will always understand what you _say_). They are usually quickly disappointed.
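The kind of hard-coded pronunciation rules netchord describes could be sketched roughly like this (a hypothetical illustration, not the actual AVR database format; the alias table and function names are invented for the example):

```python
# Hypothetical sketch of a pronunciation-alias table: map the phonetic
# strings a speech recognizer is likely to emit onto the canonical
# artist names stored in a music library.
PRONUNCIATION_ALIASES = {
    "ack slash dee see": "AC/DC",
    "shar day": "Sade",
    "three eleven": "311",
    "the stones": "The Rolling Stones",
}


def resolve_artist(recognized_phrase: str) -> str:
    """Return the canonical artist name for a recognized phrase,
    falling back to the phrase itself if no alias matches."""
    key = recognized_phrase.strip().lower()
    return PRONUNCIATION_ALIASES.get(key, recognized_phrase)
```

For example, `resolve_artist("Ack Slash Dee See")` would map to `AC/DC`, while an unknown phrase like `"Radiohead"` passes through unchanged. The point of netchord's remark is that each language (and each awkward band name) needs its own entries in such a table, which is a lot of manual curation work.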
Then there are people (like me) who find it deeply awkward to talk to their hardware aloud in the presence of other people (and I find it disturbing when others do it), which significantly limits the usefulness if you are not home alone.

And then there are people who love a completely hands-off interaction and are willing to learn how to use the system (and what kind of pronunciation it expects); for them this can already work really well even today.

I think voice interaction will always remain an interaction model that only works for part of the population, but I'm pretty sure it can work quite well for those who like it.

---
learn more about iPeng, the iPhone and iPad remote for the Squeezebox and Logitech UE Smart Radio as well as iPeng Party, the free Party-App, at penguinlovesmusic.com
*New: iPeng 9, the Universal App for iPhone, iPad and Apple Watch*
------------------------------------------------------------------------
pippin's Profile: http://forums.slimdevices.com/member.php?userid=13777
View this thread: http://forums.slimdevices.com/showthread.php?t=105674
_______________________________________________
discuss mailing list
discuss@lists.slimdevices.com
http://lists.slimdevices.com/mailman/listinfo/discuss