Have a look at this; I don't know if there's source available, but it might be a start...

http://www.allflashwebsite.com/article/real-time-lip-sync-in-flash



On 03/06/2010 15:03, Eric E. Dolecki wrote:
I've tried running software-voice vowels through the system, and I'm able to create "signatures" for the vowels that are somewhat accurate (depending on how the vowel is influenced within a word, or if it's standalone). I've run them several times and my values always seem to match (which is good). I end up with a very long stream of numbers for a signature because of the enter-frame sampling. I'm wondering what the best way is to compare the current values, over a period of time, against the known values. What's a fast/best lookup method to check against?

For instance, a spoken "A" for me looks like this:

speech loaded
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.16304096207022667
0.4167095571756363
1.840158924460411
1.840158924460411
2.3130274564027786
2.7141911536455154
2.7141911536455154
5.49285389482975
8.781380131840706
9.142853170633316
9.142853170633316
... TONS more data...
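One reasonable answer to the lookup question above (a sketch, not the only approach): resample every stored signature to a short fixed length, then compare the most recent window of live values against each one by Euclidean distance and take the nearest. Written in JavaScript for illustration rather than AS3; the function names, window length, and distance metric are all assumptions.

```javascript
// Resample an array of per-frame values to `n` points by linear interpolation,
// so signatures of different durations become comparable.
function resample(samples, n) {
  const out = [];
  for (let i = 0; i < n; i++) {
    const pos = (i * (samples.length - 1)) / (n - 1);
    const lo = Math.floor(pos);
    const hi = Math.min(lo + 1, samples.length - 1);
    const frac = pos - lo;
    out.push(samples[lo] * (1 - frac) + samples[hi] * frac);
  }
  return out;
}

// Euclidean distance between two equal-length vectors.
function distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Return the name of the stored signature nearest to the live window.
function matchVowel(liveWindow, signatures, n = 16) {
  const live = resample(liveWindow, n);
  let best = null;
  let bestDist = Infinity;
  for (const [name, sig] of Object.entries(signatures)) {
    const d = distance(live, resample(sig, n));
    if (d < bestDist) { bestDist = d; best = name; }
  }
  return best;
}
```

The comparison could run a few times a second rather than on every enter frame. If timing stretches differ a lot between takes of the same vowel, dynamic time warping is the usual upgrade over plain resampling.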




On Thu, Jun 3, 2010 at 8:23 AM, Eric E. Dolecki <edole...@gmail.com> wrote:

I don't think that's enough. Has anyone seen pitch detection in AS3 yet (no
microphone source)? That might be enough, but I'm not sure.
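For what it's worth, a basic pitch detector can be built on raw time-domain samples, which AS3's SoundMixer.computeSpectrum(bytes, false) does provide without a microphone. A minimal autocorrelation sketch, in JavaScript for illustration (the function name, Hz range defaults, and buffer size are assumptions): find the lag at which the waveform best correlates with a shifted copy of itself, and the pitch is sampleRate / lag.

```javascript
// Plain autocorrelation pitch detection on time-domain samples (-1..1).
// Searches lags corresponding to minHz..maxHz and returns the detected
// frequency in Hz, or 0 if nothing correlates.
function detectPitch(samples, sampleRate, minHz = 60, maxHz = 500) {
  const maxLag = Math.floor(sampleRate / minHz); // longest period considered
  const minLag = Math.floor(sampleRate / maxHz); // shortest period considered
  let bestLag = -1;
  let bestCorr = 0;
  for (let lag = minLag; lag <= maxLag && lag < samples.length; lag++) {
    let corr = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      corr += samples[i] * samples[i + lag];
    }
    if (corr > bestCorr) { bestCorr = corr; bestLag = lag; }
  }
  return bestLag > 0 ? sampleRate / bestLag : 0;
}
```

Plain autocorrelation is crude (it can lock onto harmonics on noisy input), but for a clean software voice it may be enough to drive jaw height.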


On Thu, Jun 3, 2010 at 5:55 AM, Karl DeSaulniers <k...@designdrumm.com> wrote:

You could try matching, say, a lowered jaw with low octaves and a cheeky jaw
with high octaves.
JAT (just a thought)
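That octave-to-jaw idea reduces to a mapping from detected pitch to jaw openness. A tiny sketch, assuming pitch in Hz and a 0..1 jaw value; the Hz range and the linear mapping are illustrative assumptions, not anything from this thread:

```javascript
// Map pitch (Hz) to jaw openness: low pitch -> dropped (open) jaw at 1,
// high pitch -> "cheeky" wide jaw at 0. Range defaults are assumptions.
function jawOpenness(pitchHz, loHz = 80, hiHz = 400) {
  // Clamp into range, then invert so low pitch gives the big jaw drop.
  const clamped = Math.min(Math.max(pitchHz, loHz), hiHz);
  return 1 - (clamped - loHz) / (hiHz - loHz);
}
```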


Karl


On Jun 2, 2010, at 3:20 PM, Eric E. Dolecki wrote:

This is a software voice, so nailing down vowels should be easier. However,
you mention matching recordings with the live data. What is being matched?
Some kind of pattern, I suppose. What form would the pattern take? How long
a sample should be checked continuously, etc.?

It's a big topic. I understand your concept of how to do it, but I don't
have the technical expertise or foundation to implement the idea yet.

Eric


On Wed, Jun 2, 2010 at 4:13 PM, Henrik Andersson <he...@henke37.cjb.net> wrote:
Eric E. Dolecki wrote:

I have a face that uses computeSpectrum in order to sync a mouth with
dynamic vocal-only MP3s... it works, but works much like a robot mouth. The
jaw animates by certain amounts based on volume.

I am trying to somehow get vowel approximations so that I can fire off some
events to update the mouth UI. Does anyone have any kind of algo that can
get close enough readings from the audio to detect vowels? Anything I can
do besides random amounts to adjust the mouth shape will go miles in making
my face look more realistic.
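The volume-driven jaw described above can be sketched like this. In AS3 the samples would come from SoundMixer.computeSpectrum(bytes, false) on each enter frame; here a frame is just an array of time-domain samples, and the scaling and easing factors are illustrative assumptions:

```javascript
// Root-mean-square amplitude of one frame of samples in -1..1.
function frameVolume(samples) {
  let sum = 0;
  for (const s of samples) sum += s * s;
  return Math.sqrt(sum / samples.length);
}

// Ease the jaw toward the volume-derived target each frame so it
// doesn't jitter; returns the new jaw value in 0..1.
function updateJaw(currentJaw, samples, ease = 0.3) {
  const target = Math.min(1, frameVolume(samples) * 4); // scale into 0..1
  return currentJaw + (target - currentJaw) * ease;
}
```

This reproduces the "robot mouth" behavior; the vowel detection discussed in the thread would layer mouth-shape changes on top of it.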


You really just need to collect profiles to match against. Record people
saying stuff and match the recordings with the live data. When they match,
you know what the vocal is saying.
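The "collect profiles" step might look like the following: average several recorded takes of the same vowel into one stored profile, after resampling each take to a common length. A JavaScript sketch for illustration; the function names and profile length are assumptions:

```javascript
// Linearly resample a take (array of per-frame values) to n points.
function resampleTo(samples, n) {
  const out = [];
  for (let i = 0; i < n; i++) {
    const pos = (i * (samples.length - 1)) / (n - 1);
    const lo = Math.floor(pos);
    const hi = Math.min(lo + 1, samples.length - 1);
    const f = pos - lo;
    out.push(samples[lo] * (1 - f) + samples[hi] * f);
  }
  return out;
}

// Average the resampled takes point-by-point into one stored profile.
function buildProfile(takes, n = 16) {
  const profile = new Array(n).fill(0);
  for (const take of takes) {
    const r = resampleTo(take, n);
    for (let i = 0; i < n; i++) profile[i] += r[i] / takes.length;
  }
  return profile;
}
```

The resulting profile per vowel is what the live data gets compared against at runtime.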
_______________________________________________
Flashcoders mailing list
Flashcoders@chattyfig.figleaf.com
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders



--
http://ericd.net
Interactive design and development

Karl DeSaulniers
Design Drumm
http://designdrumm.com





--
http://ericd.net
Interactive design and development




