Second Life already supports all of these things, but they're not
terribly well tuned. Someone already mentioned that it supports
voice-triggered gestures as well as lip sync, but the quality is
questionable. It also supports a rich avatar attention system that I did
a lot of work on to allow it
This is my first post, and I hope I am not too far off topic here. If I am, I ask for your tolerance.
I have been following this thread about using user tracking to animate the avatar. I would like to point out an interesting approach used by There.com. There.com uses speech recognition to extract cues