At 01:32 PM 03/01/2000 -0500, Mike Fletcher wrote:
>Minor note:
> Instrumental voice is a fairly well-researched topic. There are no
>known algorithms (at my last literature search 6 months ago) to accomplish
>the encoding in real time (there are a number of decoding algos which
>operate in real time, and a number of encoding algos that run in near-real
>time).
Cool, thanks Mike for putting a name to this. I kept getting the feeling
that this was such a simple way of doing it that _somebody_ must have come
up with it before.
Hmmm... I am surprised that there are no real-time encoders. Maybe for a
more general-purpose system real-time encoding is not possible, but this
system is specialised for voice, and sends info quite slowly. Also it
doesn't have to be strictly real-time... it can lag by a few seconds and
still be quite acceptable.
> Instrumental voice potentially allows some very interesting effects
>such as shifting tone to make a man/woman change gender without sounding
>squeaky (Interestingly this can be done by shifting a few parameters in GSM
>encodings as well), shifting sample set to take on a different voice (though
>it will not be the same voice as the original user, it will be closer than
>your own sample set), etc.
This is a very interesting idea that hadn't occurred to me. I could finally
talk with a Donald Duck voice. :-) Or like Astro from the Jetsons. heheh
...just by substituting a different voice-texture for my own.
> Note again: you really need an adaptive system, not just a single
>sample set... What happens when my dog barks or I want to share the music to
>which I'm listening?
Sounds like footsteps, slamming doors, birds, music, dog barks, cars, etc.
would largely be ignored, as they fall mostly outside the area that
describes the phonemes: only a small number of frequencies are being
sampled... just enough to distinguish between vocal sounds. It might muck
up the sound a little momentarily, but not as badly as RealAudio when
bandwidth gets tight. It is easy to ask a person to repeat what they
said... or work it out from context.
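To make the "small number of frequencies" idea concrete, here is a rough
Python sketch that measures the energy in one frame of audio at a handful
of probe frequencies, using the Goertzel algorithm. This is only an
illustration under stated assumptions: the probe frequencies, the 8000 Hz
sample rate, the frame length and all the function names are made up for
the example and are not tuned to real phonemes.

```python
import math

# Hypothetical probe frequencies in Hz, chosen only for illustration.
PROBE_FREQS = [300, 700, 1200, 2300, 3000]

def goertzel_power(samples, sample_rate, freq):
    """Squared magnitude of the DFT bin nearest to `freq` (Goertzel)."""
    n = len(samples)
    k = round(n * freq / sample_rate)          # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def frame_features(samples, sample_rate=8000):
    """One energy value per probe frequency for a single frame."""
    return [goertzel_power(samples, sample_rate, f) for f in PROBE_FREQS]

def tone(freq, n=400, sample_rate=8000):
    """Generate n samples of a unit-amplitude sine wave (test signal)."""
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]

in_band = frame_features(tone(700))    # sits on one of the probe frequencies
off_band = frame_features(tone(100))   # a pure tone away from every probe
```

A pure tone at 700 Hz shows up almost entirely in the second feature,
while a pure 100 Hz tone contributes next to nothing to any of them. Real
noises such as a door slam or a bark are broadband, so some of their
energy would still leak into the probe frequencies; the sketch only shows
why energy away from the sampled frequencies is mostly not seen.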
I shall have a look at coding a single voice-texture system and see if I
can get it to work. If I have success with that I may try a continual or
adaptive sampling system. But nobody should expect it tomorrow... I have
a lot of things to do at the moment.
Best wishes,
- Miriam
-----------------------------------------------------------------
http://werple.net.au/~miriam/
Virtual Reality Association (VRA)
Melbourne, Australia
http://www.vr.org.au/