Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-04 Thread Karim Beyrouti
Yeh - not sure this will help

however - a (very talented) colleague of mine worked on a simple speech 
recognition software for mobile - it was built to recognise about 20 commands 
with 90% success rate.

His approach (in my simplistic terms) was:

1) get recordings / audio samples of the commands (in your case vowels - it 
should be easier as it's generated, so you won't have to compare against too 
many different intonations)
2) create / store a graph of the audio commands (this used FFTs to abstract 
and simplify the pattern of the commands - the result was a square voice 
print graph)
3) the stored patterns / voiceprints were then compared against the user's 
voice recording.
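A minimal sketch of the three steps above, written in Python rather than AS3 so it can be run anywhere. The function names, the synthetic test tones and the Euclidean distance are assumptions made for illustration, not the colleague's actual method. In Flash, SoundMixer.computeSpectrum with FFTMode set to true would supply the spectrum instead of the naive DFT used here.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for the lower half of the bins (an FFT yields the same values faster)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))) / n
        for k in range(n // 2)
    ]

def distance(a, b):
    """Euclidean distance between two equal-length voiceprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(frame, voiceprints):
    """Return the label of the stored voiceprint closest to this frame's spectrum."""
    spectrum = magnitude_spectrum(frame)
    return min(voiceprints, key=lambda label: distance(spectrum, voiceprints[label]))

def tone(bin_index, n=64):
    """Synthetic stand-in for a recorded vowel: a sine that lands exactly on one DFT bin."""
    return [math.sin(2 * math.pi * bin_index * i / n) for i in range(n)]

# Steps 1 and 2: "record" each vowel and store its spectrum as the voiceprint.
voiceprints = {"a": magnitude_spectrum(tone(3)), "e": magnitude_spectrum(tone(9))}

# Step 3: compare incoming audio (here a quieter copy of "e") against the stored prints.
print(best_match([0.8 * s for s in tone(9)], voiceprints))
```

Real vowels spread energy over many bins, so a practical voiceprint would probably average several frames per vowel before comparing.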

The trickiest part of this whole business was the Fast Fourier Transforms - 
these things get very complicated, and confuse the life out of me. Anyway, 
hopefully this will help you - seems like it might be the best approach. If 
you do crack it, you will end up with a simple voice recognition system, 
which would be a brilliant and useful bit of code to have...

hope this was of any use..

- karim

On 4 Jun 2010, at 01:23, Karl DeSaulniers wrote:

 I would try using that to figure out a way of mapping the sounds and then 
 translate that to your project. You are able to see the waveforms in 
 Soundbooth? Haven't used it. If so, can you run your cursor over it at any 
 point to get the readings? Might be a little trivial, but may yield a pattern 
 that you can utilize.
 
 JAT
 
 Karl
 
 Sent from losPhone
 
 On Jun 3, 2010, at 6:18 PM, Eric E. Dolecki edole...@gmail.com wrote:
 
 SoundBooth
 
 On Thu, Jun 3, 2010 at 6:39 PM, Karl DeSaulniers k...@designdrumm.com wrote:
 
 Do you have SoundEdit? Or the like?
 
 
 Karl
 
 
 
 On Jun 3, 2010, at 5:09 PM, Eric E. Dolecki wrote:
 
 I think I might make waveform bitmaps and then try and compare against the
 current waveform (block EQ) - and if it's a close match, then fire off
 specific vowel events. If that works, I could do consonants too. If this
 works, I'll do jumping jacks and shots of Jack.
 
 So how would I compare two bitmaps to see if a waveform (
 On Thu, Jun 3, 2010 at 5:18 PM, Karl DeSaulniers k...@designdrumm.com
 wrote:
 
 If you need any of these files or can't find them, lmk and I can send off
 list.
 
 Best,
 
 Karl
 
 
 
 On Jun 3, 2010, at 3:37 PM, Karl DeSaulniers wrote:
 
 Don't know if this will help, but have you looked into WaveAnalyzer.as or
 Flash MX - Audio: Sound completion event? (The source files for this can be
 found in the Flash MX/Samples folder.)
 They both let you control the sound. I am thinking this will point you in
 a good direction. It's AS2 though.
 
 HTH,
 
 Karl
 
 
 On Jun 3, 2010, at 2:42 PM, Eric E. Dolecki wrote:
 
 Ya - I have the data for both things, but they extend over time and are
 difficult to compare. It's the boiling down the signatures into something
 simple and being able to read the playing audio looking for the match (or
 near match). I thought about using bitmap data and trying to match up
 waveforms, etc. but I don't know enough about it to pull that off. It seems
 like a hack in a way, but if it worked, who cares I suppose.
 
 On Thu, Jun 3, 2010 at 3:31 PM, Juan Pablo Califano 
 califa010.flashcod...@gmail.com wrote:
 
 
 
 I'm not Henrik, but I've done some lip-synch stuff for Disney. We did
 it pretty much the way Eric described--we just used amplitude. It's
 not as accurate as Disney would demand on a film, but it's ok in the
 kids' game market.
 
 
 
 I see, amplitudes could be just good enough for some stuff.
 
 Although the speed and the intensity of the speech could give misleading
 results, I think. I'm under the impression that you should somehow try to
 compare the shape of the waves (somehow simplify your input to some value
 or set of values that are easier to compare, possibly in a time window)
 and compare it in some meaningful way to precalculated samples to find a
 matching pattern. That's the part I have no clue about!
 
 Cheers
 Juan Pablo Califano
 
 2010/6/3 Kerry Thompson al...@cyberiantiger.biz
 
 Juan Pablo Califano wrote:
 
 
 Wow. That was really uncalled for.
 
 
 
 That was my reaction, too. I didn't see Eric as complaining--just
 asking. Maybe Henrik was just having a bad day.
 
 For me, the hard part, which you seem to imply is rather simple here, is
 *matching* the input audio against said profiles. Admittedly, I don't know
 anything about digital signal processing and audio programming in general,
 but matching sounds a bit vague. Perhaps you could enlighten us, if you
 feel like it.
 
 
 
 I'm not Henrik, but I've done some lip-synch stuff for Disney. We did
 it pretty much the way Eric described--we just used amplitude. It's
 not as accurate as Disney would demand on a film, but it's ok in the
 kids' game market.
 
 Doing something more accurate would probably involve at least 6 mouth
 positions, and if you're doing it in real 

Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-04 Thread Eric E. Dolecki
I've started implementing some code this morning in the hopes of matching the
vowel "a". Of course there are several intonations for this
depending on the word it's located in, but if I can get a match on a naked
"a" I may be on to something. Like you said, I have a higher chance of
success since the voice is software generated and not from random people's
speech patterns.

If I don't get something today I'm going to bail on the engine in the hopes
of finding something useful some other time. This isn't a critical feature
for me as I have the jaw moving with precision and the effect comes across.
Mouth shapes would be the icing on the cake.

Eric


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-04 Thread Eric E. Dolecki
I can get waveforms... but say "a" takes 1 second to speak. I get different
waveforms over that 1 second... so I'm not matching against a single
waveform, but many waveforms in succession. This seems like a tricky thing
to match against.

What might be a good approach to matching values over a certain amount of
time? Is AS3 fast enough to sync quickly enough? I imagine it would need to
check for all vowels every frame, matching values in waveforms over a certain
amount of time.
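One possible way to match values over time is to keep a short rolling history of per-frame readings and score it against each stored vowel sequence on every frame. The sketch below is Python, not AS3, and the class name, the mean-absolute-difference score and the threshold are all assumptions chosen for illustration. The work per frame is only (number of vowels) x (template length) subtractions, so the same idea may well be cheap enough for an enter-frame handler, though that is untested here.

```python
from collections import deque

def mean_abs_diff(a, b):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

class VowelMatcher:
    """Hypothetical helper: matches the most recent readings against stored sequences."""

    def __init__(self, templates, threshold):
        self.templates = templates          # label -> list of per-frame readings
        self.threshold = threshold          # largest score still counted as a match
        longest = max(len(t) for t in templates.values())
        self.recent = deque(maxlen=longest)  # rolling history, oldest reading dropped first

    def push(self, value):
        """Add one per-frame reading; return the best matching label or None."""
        self.recent.append(value)
        history = list(self.recent)
        best_label, best_score = None, self.threshold
        for label, template in self.templates.items():
            if len(history) < len(template):
                continue  # not enough frames seen yet for this template
            score = mean_abs_diff(history[-len(template):], template)
            if score <= best_score:
                best_label, best_score = label, score
        return best_label

# Illustrative templates only, not measured data.
matcher = VowelMatcher({"a": [0.2, 1.8, 2.7], "o": [5.0, 5.0, 1.0, 0.5]}, threshold=0.25)
for reading in [0.0, 0.3, 1.7, 2.8, 9.0]:
    print(reading, matcher.push(reading))
```

A fixed-length comparison like this assumes the vowel is always spoken at the same speed, which seems plausible for a software voice but not for human speech.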

Eric


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-04 Thread Eric E. Dolecki
I was able to match a single "a" - although even with a straight "a" there
can be some subtle variation. So I mapped variations that come close, and I
don't need to match every value in the complete waveform over time... every
couple together or even the first value with buffer comes pretty close. This
is with a known, unchanging vocal waveform. So I doubt this would be very
useful outside of this current system, which is a bummer.
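The idea of checking only every couple of values, each with a small buffer, might look like the following. This is a Python sketch under assumptions: the function name, the tolerance and the reference numbers are invented for illustration and are not the code or data described above.

```python
def roughly_matches(stream, reference, tolerance=0.05, step=2):
    """Compare only every `step`-th reading, allowing each to be off by `tolerance`."""
    if len(stream) < len(reference):
        return False  # not enough readings collected yet
    return all(
        abs(stream[i] - reference[i]) <= tolerance
        for i in range(0, len(reference), step)
    )

# Illustrative numbers only, not measurements from the thread.
reference_a = [0.2, 0.4, 1.8, 2.3, 2.7]
live = [0.22, 9.9, 1.77, 0.0, 2.73]

print(roughly_matches(live, reference_a))          # readings 0, 2 and 4 are checked
print(roughly_matches(live, reference_a, step=1))  # every reading is checked
```

Skipping readings makes the check faster and more forgiving, at the cost of accepting streams that differ wildly at the positions that are never inspected.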

I think it's time for me to retire this code and move on. Oh well...

Eric



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-04 Thread Karl DeSaulniers
I would say there are about 5 - 7 mouth shapes you could distribute
through your animation that would give the impression that the avatar
is saying the right words.
Plus if your animation is fluid (meaning it doesn't look like the
avatar is straining to say the words) it probably won't be noticeable
if it mouths the wrong word from time to time.


JAT

Karl



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Glen Pike
If your MP3s are pre-recorded rather than people recording them 
dynamically, could you use cue points?


On 02/06/2010 20:57, Eric E. Dolecki wrote:

I have a face that uses computeSpectrum in order to sync a mouth with
dynamic vocal-only MP3s... it works, but works much like a robot mouth. The
jaw animates by certain amounts based on volume.

I am trying to somehow get vowel approximations so that I can fire off some
events to update the mouth UI. Does anyone have any kind of algo that can
somehow get close enough readings from audio to detect vowels? Anything I
can do besides random to adjust the mouth shape will go miles in making my
face look more realistic.
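For reference, the volume-driven jaw described above can be sketched roughly like this. It is Python rather than AS3, and the gain and smoothing values are assumptions picked for illustration. In Flash the samples would come from SoundMixer.computeSpectrum rather than a plain list.

```python
import math

def rms(samples):
    """Root-mean-square loudness of one frame of samples in the range -1..1."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def jaw_opening(samples, previous=0.0, gain=2.0, smoothing=0.5):
    """Map loudness to a jaw opening in [0, 1], eased toward the previous value."""
    target = min(1.0, rms(samples) * gain)
    return previous + (target - previous) * (1.0 - smoothing)

# A loud frame opens the jaw halfway on the first update, further on the next.
frame = [0.5, -0.5, 0.5, -0.5]
first = jaw_opening(frame)
second = jaw_opening(frame, previous=first)
print(first, second)
```

Because only loudness is used, every vowel produces the same mouth, which is the robot-like effect the question is trying to get past.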

Thanks for any insights.

Eric
___
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders


   




Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Karl DeSaulniers
You could try matching say a lowered jaw with low octaves and a  
cheeky jaw with high octaves.
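A rough sketch of that idea, assuming a frequency spectrum is already available as a list of magnitudes (for example from computeSpectrum in FFT mode). It is written in Python, and the pose names, the quarter-way split point and the function names are assumptions for illustration.

```python
def band_energy(spectrum, start, end):
    """Sum of squared magnitudes for bins start..end-1."""
    return sum(v * v for v in spectrum[start:end])

def jaw_shape(spectrum, split=None):
    """Pick a jaw pose from where most of the spectral energy sits."""
    split = len(spectrum) // 4 if split is None else split
    low = band_energy(spectrum, 0, split)
    high = band_energy(spectrum, split, len(spectrum))
    if low + high == 0:
        return "closed"  # silence
    return "lowered" if low >= high else "cheeky"

print(jaw_shape([0.9, 0.8, 0.0, 0.0, 0.0, 0.0, 0.1, 0.0]))  # energy in the low bins
print(jaw_shape([0.0, 0.1, 0.0, 0.0, 0.7, 0.6, 0.0, 0.0]))  # energy in the high bins
```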

JAT


Karl

On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:

This is a software voice, so nailing down vowels should be easier. However
you mention matching recordings with the live data. What is being matched?
Some kind of pattern I suppose. What form would the pattern take? How long
of a sample should be checked continuously, etc.?

It's a big topic. I understand your concept of how to do it, but I don't
have the technical expertise or foundation to implement the idea yet.

Eric


On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson he...@henke37.cjb.net wrote:





You really just need to collect profiles to match against. Record people
saying stuff and match the recordings with the live data. When they match,
you know what the vocal is saying.





--
http://ericd.net
Interactive design and development


Karl DeSaulniers
Design Drumm
http://designdrumm.com



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
It's using dynamic text to speech, so I wouldn't be able to use cue points
reliably.



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no
microphone source)? That might be enough but I'm not sure.
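For what it's worth, a basic pitch estimate needs only the raw samples, wherever they come from. The sketch below uses plain autocorrelation, in Python rather than AS3. The function name and the 60 to 500 Hz search range are assumptions for illustration, and this is not an existing AS3 library.

```python
import math

def detect_pitch(samples, sample_rate, min_hz=60.0, max_hz=500.0):
    """Estimate the fundamental by finding the lag with the strongest autocorrelation."""
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # How well the signal lines up with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

# A synthetic 200 Hz tone sampled at 8 kHz stands in for the voice.
rate = 8000
voice = [math.sin(2 * math.pi * 200 * i / rate) for i in range(800)]
print(detect_pitch(voice, rate))
```

Pitch says how high the voice is, not which vowel is being spoken, so on its own it may well not be enough either.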

On Thu, Jun 3, 2010 at 5:55 AM, Karl DeSaulniers k...@designdrumm.comwrote:

 You could try matching say a lowered jaw with low octaves and a cheeky jaw
 with high octaves.
 JAT


 Karl


 On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:

  This is a software voice, so nailing down vowels should be easier. However
 you mention matching recordings with the live data. What is being matched?
 Some kind of pattern I suppose. What form would the pattern take? How long
 of a sample should be checked continuously, etc.?

 It's a big topic. I understand your concept of how to do it, but I don't
 have the technical expertise or foundation to implement the idea yet.

 Eric


 On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson he...@henke37.cjb.net
 wrote:

  Eric E. Dolecki wrote:

  I have a face that uses computeSpectrum in order to sync a mouth with
 dynamic vocal-only MP3s... it works, but works much like a robot mouth.
 The
 jaw animates by certain amounts based on volume.

 I am trying to somehow get vowel approximations so that I can fire off
 some
 events to update the mouth UI. Does anyone have any kind of algo that
 can
 somehow get close enough readings from audio to detect vowels? Anything
 I
 can do besides random to adjust the mouth shape will go miles in making
 my
 face look more realistic.


  You really just need to collect profiles to match against. Record
 people
 saying stuff and match the recordings with the live data. When they
 match,
 you know what the vocal is saying.
 ___
 Flashcoders mailing list
 Flashcoders@chattyfig.figleaf.com
 http://chattyfig.figleaf.com/mailman/listinfo/flashcoders




 --
 http://ericd.net
 Interactive design and development


 Karl DeSaulniers
 Design Drumm
 http://designdrumm.com






-- 
http://ericd.net
Interactive design and development


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Henrik Andersson

Eric E. Dolecki wrote:

It's using dynamic text to speech, so I wouldn't be able to use cue points
reliably.



Use dynamic cuepoints and stop complaining. If it can generate voice, it 
can tell you what kinds of voice it put where. It is far more exact than 
trying to reverse the incredibly lossy transformation that the synthesis is.
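
A minimal sketch of the cue-point idea, assuming the text-to-speech engine can report when each phoneme starts. The timeline format, the phoneme labels and the mouth-shape names are all hypothetical. It is in Python rather than AS3 for brevity; in Flash the playback position would presumably come from SoundChannel.position.

```python
import bisect

# Hypothetical timeline as a TTS engine might report it: (start time in ms, phoneme).
timeline = [(0, "sil"), (120, "HH"), (180, "EH"), (310, "L"), (390, "OW"), (620, "sil")]

# Hypothetical mapping from phonemes to mouth shapes; anything unlisted is "closed".
MOUTH_SHAPES = {"EH": "wide", "OW": "round", "AA": "open", "L": "tongue"}

def mouth_at(timeline, position_ms):
    """Return the mouth shape for the phoneme active at position_ms."""
    starts = [start for start, _ in timeline]
    # Find the last cue point that starts at or before the playback position.
    index = bisect.bisect_right(starts, position_ms) - 1
    if index < 0:
        return "closed"
    return MOUTH_SHAPES.get(timeline[index][1], "closed")
```

Calling mouth_at(timeline, position) once per frame would drive the mouth without analysing the audio at all, which is the point being made above.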



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
I've tried running software voice vowels through the system, and I am able to
create signatures for the vowels that are somewhat accurate (depending on
whether the vowel is standalone or influenced by the word around it). I've run
them several times and my values always seem to match (which is good). I end
up with a very long stream of numbers for each signature because I sample on
every enter frame. I am wondering what the best way might be to compare the
current values, over a period of time, against the known values. What's a fast
lookup method to check against?

For instance, a spoken A for me looks like this:

speech loaded
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.4167095571756363
1.840158924460411
1.840158924460411
2.3130274564027786
2.7141911536455154
2.7141911536455154
5.49285389482975
8.781380131840706
9.142853170633316
9.142853170633316
... TONS more data...
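
One possible way to make such a stream cheap to look up is to average it into a small, fixed number of buckets and then pick the stored signature with the smallest average difference. This is a sketch in Python rather than AS3. The bucket count, the distance measure and the function names are assumptions for illustration, not a tested recipe.

```python
def bucket_signature(values, buckets=16):
    """Average a long per-frame stream into a fixed number of buckets."""
    if len(values) < buckets:
        raise ValueError("need at least one value per bucket")
    signature = []
    for b in range(buckets):
        start = b * len(values) // buckets
        end = (b + 1) * len(values) // buckets
        chunk = values[start:end]
        signature.append(sum(chunk) / len(chunk))
    return signature

def distance(a, b):
    """Mean absolute difference between two equal-length signatures."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def best_match(signature, known):
    """Return (name, distance) of the stored signature closest to `signature`."""
    name = min(known, key=lambda k: distance(signature, known[k]))
    return name, distance(signature, known[name])
```

With only a handful of vowels and 16 numbers each, a linear scan like this is probably fast enough to run on every frame. A distance cut-off would still be needed to reject frames that match nothing.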








-- 
http://ericd.net
Interactive design and development


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
My most humble apologies go out to Henrik and anyone else who felt that I
was complaining about something. Which I wasn't.





-- 
http://ericd.net
Interactive design and development


RE: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Merrill, Jason
My most humble apologies go out to Henrik and anyone else who felt that I was 
complaining about something. Which I wasn't.

I don't think you need to apologize, I didn't think you were complaining at all 
- just stating your view of how you see this technique working with your 
project.


Jason Merrill 

Instructional Technology Architect
Bank of  America  Global Learning 

Join the Bank of America Flash Platform Community  and visit our Instructional 
Technology Design Blog
(note: these are for Bank of America employees only)




Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Glen Pike
Have a look at this stuff, don't know if there is source, but it might 
be a start...


http://www.allflashwebsite.com/article/real-time-lip-sync-in-flash





Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Juan Pablo Califano
Wow. That was really uncalled for.

Anyway, if you can pre-generate samples for all vowels for all samples, I
can't see why comparing them to the speech generated by the same system
would  be any harder than comparing it to a number of collected profiles.


You really just need to collect profiles to match against. Record people
saying stuff and match the recordings with the live data. When they match,
you know what the vocal is saying.


For me, the hard part, which you seem to imply is rather simple here, is
*matching* the input audio against said profiles. Admittedly, I don't know
anything about digital signal processing and audio programming in general,
but matching sounds a bit vague. Perhaps you could enlighten us, if you
feel like it.

Cheers
Juan Pablo Califano



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
I'm abandoning the whole vowel recognition unless I can find something
someone else has done to base my implementation on. I've burnt too much time
on it for something that won't give a whole lot of bang for the buck. It's a
very complex problem (for me anyway).

Eric





-- 
http://ericd.net
Interactive design and development


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Henrik Andersson

Juan Pablo Califano wrote:

Wow. That was really uncalled for.



I meant no ill will. I just meant that there is a better solution than 
this idea.




For me, the hard part, which you seem to imply is rather simple here, is
*matching* the input audio against said profiles. Admittedly, I don't know
anything about digital signal processing and audio programming in general,
but matching sounds a bit vague. Perhaps you could enlighten us, if you
feel like it.


Since you asked, I actually don't know how to do it myself. I have 
honestly studied the subject, but I still don't know how to do it.



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Kerry Thompson
Juan Pablo Califano wrote:

 Wow. That was really uncalled for.

That was my reaction, too. I didn't see Eric as complaining--just
asking. Maybe Henrik was just having a bad day.

 For me, the hard part, which you seem to imply is rather simple here, is
 *matching* the input audio against said profiles. Admittedly, I don't know
 anything about digital signal processing and audio programming in general,
 but matching sounds a bit vague. Perhaps you could enlighten us, if you
 feel like it.

I'm not Henrik, but I've done some lip-synch stuff for Disney. We did
it pretty much the way Eric described--we just used amplitude. It's
not as accurate as Disney would demand on a film, but it's ok in the
kids' game market.

Doing something more accurate would probably involve at least 6 mouth
positions, and if you're doing it in real time, you'd have to run an
FFT on the audio. It can be done--there was a really good commercial
lip-synch program that generated ActionScript to control mouth
positions. I don't know if it's still around--that was 5 years ago,
and it was pretty expensive (about $2,500 for one seat, I think). It
may even have been a Director Xtra that worked with a Flash Sprite,
but let's not talk about Director :-P

Cordially,

Kerry Thompson


RE: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Merrill, Jason
 I meant no ill will. 
 I just meant that there is a better solution than this idea.

Ah, so that's what you meant by stop complaining.  I see, that makes
perfect sense now. ;)  


Jason Merrill 

Instructional Technology Architect
Bank of  America  Global Learning 






RE: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Merrill, Jason
there was a really good commercial lip-synch program that generated
ActionScript to control mouth positions. I don't know if it's still
around--that was 5 years ago, and it was pretty expensive (about $2,500
for one seat, I think). It may even have been a Director Xtra that worked
with a Flash Sprite

You're probably thinking of the Flash-based SitePal:
http://www.sitepal.com/   ?

We had licenses for a while, but they were dropped once we discovered
that the audience found them annoying and that the ongoing license fees
were ridiculous. They have gotten better, visually at least, but having
to keep paying to use technology like this is kinda stupid, I think.


Jason Merrill 

Instructional Technology Architect
Bank of  America  Global Learning 





Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Juan Pablo Califano


I'm not Henrik, but I've done some lip-synch stuff for Disney. We did
it pretty much the way Eric described--we just used amplitude. It's
not as accurate as Disney would demand on a film, but it's ok in the
kids' game market.



I see, amplitudes could be just good enough for some stuff.

Although the speed and the intensity of the speech could give misleading
results, I think. I'm under the impression that you should somehow try to
compare the shape of the waves (somehow simplify your input to some value
or set of values that is easier to compare, possibly in a time window)
and compare it in some meaningful way to precalculated samples to find a
matching pattern. That's the part I have no clue about!
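
One standard technique for comparing shapes regardless of speed is dynamic time warping (DTW). Nobody in this thread proposed it, so treat this as a swapped-in illustration. Scaling each sequence by its peak first takes intensity out of the comparison. The sketch is in Python rather than AS3, and the function names are assumptions.

```python
def normalize(values):
    """Scale so the largest absolute value is 1, removing loudness differences."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of numbers."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances, b repeats
                                    cost[i][j - 1],      # b advances, a repeats
                                    cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]
```

A slow, quiet rendition and a fast, loud rendition of the same contour come out with a cost of zero, while a different contour does not. The cost grows with the product of the two lengths, so the inputs would need to be short (for example, a few dozen values per time window).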

Cheers
Juan Pablo Califano



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
Ya - I have the data for both things, but they extend over time and are
difficult to compare. The hard part is boiling the signatures down into
something simple and then reading the playing audio to look for a match (or
near match). I thought about using bitmap data and trying to match up
waveforms, etc., but I don't know enough about it to pull that off. It seems
like a hack in a way, but if it worked, who cares, I suppose.
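
A rough sketch of the "read the playing audio and look for a near match" part: keep only the most recent few readings and compare that window against each stored signature on every frame. It is in Python rather than AS3. The class name, window size and threshold are hypothetical and would need tuning against real data.

```python
from collections import deque

class StreamMatcher:
    """Keep the last `window` readings and report which template they resemble."""

    def __init__(self, templates, window, threshold):
        for name, template in templates.items():
            if len(template) != window:
                raise ValueError("template %r must have %d values" % (name, window))
        self.templates = templates
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # old readings fall off automatically

    def push(self, reading):
        """Add one per-frame reading; return a template name or None."""
        self.recent.append(reading)
        if len(self.recent) < self.recent.maxlen:
            return None  # not enough history yet
        best_name, best_dist = None, self.threshold
        for name, template in self.templates.items():
            # Mean absolute difference between the window and the template.
            dist = sum(abs(x - y) for x, y in zip(self.recent, template)) / len(template)
            if dist <= best_dist:
                best_name, best_dist = name, dist
        return best_name
```

Calling push() from the enter-frame handler and dispatching an event whenever it returns a name would give the near-match behaviour described above.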





-- 
http://ericd.net
Interactive design and development


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Kerry Thompson
Jason Merrill wrote:

 You're probably thinking of the Flash-based SitePal:
 http://www.sitepal.com/   ?

It could have been. I honestly don't remember--it was at least 5 years
ago. We considered using the software, but the studio head vetoed it
as too expensive, especially since we already had an Xtra, written in
C++, to measure amplitude. Basically, the higher the amplitude, the
more open the mouth was. I think we only used 3-4 mouth positions, and
it was good enough for Disney.
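
The amplitude approach can be sketched in a few lines. This is Python rather than the C++ Xtra mentioned above, and the threshold values are made-up placeholders that would need tuning by eye against the actual audio.

```python
import math

def rms(samples):
    """Root-mean-square level of a frame of samples in the range -1..1."""
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0

# Hypothetical thresholds; each one is the lowest level for the next mouth frame.
THRESHOLDS = [0.02, 0.10, 0.30]

def mouth_position(samples, thresholds=THRESHOLDS):
    """Return 0 (closed) up to len(thresholds) (wide open)."""
    level = rms(samples)
    # Count how many thresholds the level has reached.
    return sum(1 for t in thresholds if level >= t)
```

Three thresholds give four mouth frames, in line with the 3-4 positions described. Smoothing the level over a couple of frames would probably reduce jitter.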

It was a series of games based on Disney Channel properties (or
cartoons, as they are known in the real world). I don't watch much
Disney Channel, but I suspect their lip synch isn't up to the same
standards as their movies.

Cordially,

Kerry Thompson



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Karl DeSaulniers
Don't know if this will help, but have you looked into WaveAnalyzer.as or
Flash MX - Audio: Sound completion event? (The source files for this can be
found in the Flash MX/Samples folder.) They both let you control the sound.
I am thinking this will point you in a good direction. It's AS2, though.


HTH,

Karl




Karl DeSaulniers
Design Drumm
http://designdrumm.com



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Karl DeSaulniers
If you need any of these files or can't find them, let me know and I can
send them off-list.


Best,

Karl




Karl DeSaulniers
Design Drumm
http://designdrumm.com



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
I think I might make waveform bitmaps and then try and compare against the
current waveform (block EQ) - and if it's a close match, then fire off
specific vowel events. If that works, I could do consonants too. If this
works, I'll do jumping jacks and shots of Jack.

So how would I compare two bitmaps to see if a waveform (
On Thu, Jun 3, 2010 at 5:18 PM, Karl DeSaulniers k...@designdrumm.com wrote:

 If you need any of these files or can't find them, lmk and I can send off
 list.

 Best,

 Karl



 On Jun 3, 2010, at 3:37 PM, Karl DeSaulniers wrote:

  Don't know if this will help, but have you looked into WaveAnalyzer.as or
 Flash MX - Audio: Sound completion event (The source files for this can be
 found in the Flash MX/Samples folder.)
 They both let you control the sound. I am thinking this will point you in
 a good direction. It's AS2 though.

 HTH,

 Karl


 On Jun 3, 2010, at 2:42 PM, Eric E. Dolecki wrote:

  Ya - I have the data for both things, but they extend over time and are
 difficult to compare. It's the boiling down the signatures into something
 simple and being able to read the playing audio looking for the match (or
 near match). I thought about using bitmap data and trying to match up
 waveforms, etc. but I don't know enough about it to pull that off. It seems
 like a hack in a way, but if it worked, who cares I suppose.

 On Thu, Jun 3, 2010 at 3:31 PM, Juan Pablo Califano 
 califa010.flashcod...@gmail.com wrote:



 I'm not Henrik, but I've done some lip-synch stuff for Disney. We did
 it pretty much the way Eric described--we just used amplitude. It's
 not as accurate as Disney would demand on a film, but it's ok in the
 kids' game market.



 I see, amplitudes could be just good enough for some stuff.

 Although the speed and the intensity of the speech could give misleading
 results, I think. I'm under the impression that you should somehow try to
 compare the shape of the waves (somehow simplify your input to some value
 or set of values that are easier to compare, possibly in a time window)
 and compare it in some meaningful way to precalculated samples to find a
 matching pattern. That's the part I have no clue about!
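
One way to read the "simplify your input, then compare" idea above, as a minimal Python sketch rather than anything posted in this thread: cut each time window into a few bins, take the average absolute amplitude per bin, and scale by the peak so that loudness drops out. The function names, the default of 8 bins, and the mean-absolute-difference score are all assumptions made for illustration.

```python
def shape_signature(samples, bins=8):
    """Reduce a window of samples to `bins` loudness-normalised values."""
    if len(samples) < bins:
        raise ValueError("window is shorter than the number of bins")
    size = len(samples) // bins
    # Mean absolute amplitude per bin gives a coarse envelope of the window.
    envelope = [
        sum(abs(s) for s in samples[i * size:(i + 1) * size]) / size
        for i in range(bins)
    ]
    peak = max(envelope)
    # Dividing by the peak removes overall loudness, so only the shape remains.
    return [v / peak for v in envelope] if peak > 0 else envelope


def signature_distance(a, b):
    """Mean absolute difference between two signatures of equal length."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

A small distance means a similar shape. Normalising by the peak deals with the intensity concern raised above, but not with the speed of the speech, which this sketch leaves untouched.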

 Cheers
 Juan Pablo Califano

 2010/6/3 Kerry Thompson al...@cyberiantiger.biz

  Juan Pablo Califano wrote:

  Wow. That was really uncalled for.


 That was my reaction, too. I didn't see Eric as complaining--just
 asking. Maybe Henrik was just having a bad day.

  For me, the hard part, which you seem to imply is rather simple here, is
 *matching* the input audio against said profiles. Admittedly, I don't know
 anything about digital signal processing and audio programming in general,
 but matching sounds a bit vague. Perhaps you could enlighten us, if you
 feel like it.


 I'm not Henrik, but I've done some lip-synch stuff for Disney. We did
 it pretty much the way Eric described--we just used amplitude. It's
 not as accurate as Disney would demand on a film, but it's ok in the
 kids' game market.
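
The amplitude-only approach described above might look something like the following Python sketch. It is not Kerry's code: the function names, the four mouth frames, and the 0.5 level treated as "fully open" are assumptions chosen for illustration, and samples are assumed to be floats between -1.0 and 1.0.

```python
import math


def rms(samples):
    """Root-mean-square level of a window of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_frame(samples, frames=4, full_open_level=0.5):
    """Map a window's loudness to a frame index: 0 (closed) to frames - 1 (open)."""
    # Clamp so that anything louder than full_open_level counts as fully open.
    level = min(rms(samples) / full_open_level, 1.0)
    return min(int(level * frames), frames - 1)
```

Calling mouth_frame once per animation tick on the most recent window of samples gives the "robot mouth" behaviour: the jaw follows volume and knows nothing about which vowel is being spoken.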

 Doing something more accurate would probably involve at least 6 mouth
 positions, and if you're doing it in real time, you'd have to do a
 reverse FFT. It can be done--there was a really good commercial
 lip-synch program that generated Action Script to control mouth
 positions. I don't know if it's still around--that was 5 years ago,
 and it was pretty expensive (about $2,500 for one seat, I think). It
 may even have been a Director Xtra that worked with a Flash Sprite,
 but let's not talk about Director :-P

 Cordially,

 Kerry Thompson




-- 
http://ericd.net
Interactive design and development


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Karl DeSaulniers

Do you have SoundEdit? Or the like?


Karl



Karl DeSaulniers
Design Drumm
http://designdrumm.com


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Henrik Andersson
Before you start reinventing the square wheel, at least do some research 
on how people are doing it.


I did not learn enough from it personally, but I can tell that it is a 
good book:

http://www.dspguide.com/pdfbook.htm

Read it and then do the matching algorithm. This way you will avoid 
making a solution that deserves to end up on thedailywtf.com



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
SoundBooth

On Thu, Jun 3, 2010 at 6:39 PM, Karl DeSaulniers k...@designdrumm.com wrote:

 Do you have SoundEdit? Or the like?


 Karl




Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Eric E. Dolecki
Dude, whether you know it or not, you come off being pretty arrogant with
your comments. Don't worry, I won't be ending up on the dailywtf anytime
soon.

On Thu, Jun 3, 2010 at 6:36 PM, Henrik Andersson he...@henke37.cjb.net wrote:

 Before you start reinventing the square wheel, at least do some research on
 how people are doing it.

 I did not learn enough from it personally, but I can tell that it is a good
 book:
 http://www.dspguide.com/pdfbook.htm

 Read it and then do the matching algorithm. This way you will avoid making
 a solution that deserves to end up on thedailywtf.com





-- 
http://ericd.net
Interactive design and development


Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-03 Thread Karl DeSaulniers
I would try using that to figure out a way of mapping the sounds and  
then translate that to your project. Are you able to see the waveforms  
in Soundbooth? I haven't used it. If so, can you run your cursor over  
them at any point to get the readings? Might be a little trivial, but  
it may yield a pattern that you can utilize.


JAT

Karl

Sent from losPhone

On Jun 3, 2010, at 6:18 PM, Eric E. Dolecki edole...@gmail.com wrote:

SoundBooth

On Thu, Jun 3, 2010 at 6:39 PM, Karl DeSaulniers k...@designdrumm.com wrote:

Do you have SoundEdit? Or the like?

Karl



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-02 Thread Henrik Andersson

Eric E. Dolecki wrote:

I have a face that uses computeSpectrum in order to sync a mouth with
dynamic vocal-only MP3s... it works, but works much like a robot mouth. The
jaw animates by certain amounts based on volume.

I am trying to somehow get vowel approximations so that I can fire off some
events to update the mouth UI. Does anyone have any kind of algo that can
somehow get close enough readings from audio to detect vowels? Anything I
can do besides random to adjust the mouth shape will go miles in making my
face look more realistic.



You really just need to collect profiles to match against. Record people 
saying stuff and match the recordings with the live data. When they 
match, you know what the vocal is saying.
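
A minimal Python sketch of what "match the recordings with the live data" could mean in practice, assuming each recording and each live window has already been reduced to a short list of numbers (how to do that is not specified in this message). The profile values, the cosine-similarity score, and the 0.9 threshold are made up for illustration.

```python
import math

# Hypothetical stored profiles: one short feature list per vowel.
PROFILES = {
    "A": [0.9, 0.6, 0.1, 0.1],
    "E": [0.3, 0.2, 0.9, 0.4],
    "O": [0.2, 0.9, 0.3, 0.1],
}


def cosine_similarity(a, b):
    """1.0 for vectors pointing the same way, 0.0 for unrelated or empty ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_match(live, profiles, threshold=0.9):
    """Return the label of the closest profile, or None if none is close enough."""
    label, best = None, threshold
    for name, profile in profiles.items():
        score = cosine_similarity(live, profile)
        if score >= best:
            label, best = name, score
    return label
```

Returning None when nothing clears the threshold matters for lip-sync: consonants and silence should leave the mouth alone rather than snap to the least-bad vowel.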



Re: [Flashcoders] Question about approximate vowel detection in AS3

2010-06-02 Thread Eric E. Dolecki
This is a software voice, so nailing down vowels should be easier. However,
you mention matching recordings with the live data. What is being matched?
Some kind of pattern, I suppose. What form would the pattern take? How long
a sample should be checked continuously, etc.?

It's a big topic. I understand your concept of how to do it, but I don't
have the technical expertise or foundation to implement the idea yet.
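
One possible answer to "what form would the pattern take", offered as a Python sketch under assumptions rather than a recommendation from the thread: take a short frame of samples, compute its frequency magnitudes, and collapse them into a few band energies that sum to 1. The band count and the function names are hypothetical, and the DFT below is the naive O(N^2) form for clarity; a real FFT would replace it.

```python
import cmath


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for frequency bins 0 .. N/2 - 1."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def band_pattern(frame, bands=4):
    """Collapse a frame's spectrum into `bands` energy fractions."""
    spectrum = magnitude_spectrum(frame)
    if len(spectrum) < bands:
        raise ValueError("frame is too short for the requested number of bands")
    size = len(spectrum) // bands
    energy = [
        sum(m * m for m in spectrum[i * size:(i + 1) * size])
        for i in range(bands)
    ]
    total = sum(energy)
    # Dividing by the total makes the pattern independent of overall volume.
    return [e / total for e in energy] if total > 0 else energy
```

On sample length: at 44.1 kHz a 1024-sample frame covers about 23 ms, which is one plausible starting point for experiments. In Flash, SoundMixer.computeSpectrum called with FFTMode set to true already returns frequency data, so only the banding step would need porting.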

Eric






-- 
http://ericd.net
Interactive design and development