On 2011-09-22 at 19:42:00, g...@itchybit.org wrote:

The task would be to identify, from a live talk, the voice of the current speaker among several. Training beforehand is also possible. I guess this could surely be done by using a simple neural network trained on an FFT decomposition of the voices, so there must be some software out there for it...

If I recall correctly, it's better to take the log of the amplitude of the FFT, and then perhaps do an FFT again, before trying to find such timbral info.

An amplitude-wise log means that the spectra of filters add up instead of multiplying. That's supposed to make them easier to separate.
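
To spell that out, assuming the usual source-filter picture of a voice (the symbols E for the glottal excitation, H for the vocal-tract filter and X for the recorded signal are mine, not from the original post), the product of spectra becomes a sum once you take the log of the amplitude:

```latex
\begin{aligned}
|X(f)| &= |E(f)| \cdot |H(f)| \\
\log |X(f)| &= \log |E(f)| + \log |H(f)|
\end{aligned}
```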

And the second FFT is supposed to make it easier to separate the vowel filters from the base pitch.
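
The log-then-FFT-again idea is usually called the cepstrum. Here is a minimal sketch of it in Python with NumPy, not a tested speaker-identification tool. The function names, the Hann window, the 60 to 400 Hz pitch search range and the choice of 20 coefficients are all assumptions of mine for illustration.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of one audio frame (1-D float array)."""
    windowed = frame * np.hanning(len(frame))       # reduce spectral leakage
    magnitude = np.abs(np.fft.rfft(windowed))       # first FFT, amplitude only
    log_magnitude = np.log(magnitude + 1e-12)       # small floor avoids log(0)
    return np.fft.irfft(log_magnitude, n=len(frame))  # second (inverse) FFT

def estimate_pitch_period(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Pitch period in samples: largest cepstral peak in the assumed voice range."""
    cepstrum = real_cepstrum(frame)
    lo = int(sample_rate / fmax)   # shortest period considered
    hi = int(sample_rate / fmin)   # longest period considered
    return lo + int(np.argmax(cepstrum[lo:hi]))

def envelope_coefficients(frame, n_coeffs=20):
    """Low-quefrency part of the cepstrum, which mostly describes the filter (timbre)."""
    return real_cepstrum(frame)[:n_coeffs]

# Synthetic test tone: harmonics of 100 Hz with 1/k amplitudes, sampled at 8 kHz.
sample_rate = 8000
t = np.arange(1024) / sample_rate
frame = sum(np.sin(2 * np.pi * 100 * k * t) / k for k in range(1, 40))
period = estimate_pitch_period(frame, sample_rate)
print("estimated pitch:", sample_rate / period, "Hz")
```

The low coefficients from `envelope_coefficients` would be the natural thing to feed to the neural network mentioned above, since they mostly describe the vowel filter rather than the pitch. Whether that is enough to tell speakers apart is something I have not checked.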

But I never tried any of that, or maybe I tried making a patch and then didn't really know how I'd use it and gave up... something like that.

 _______________________________________________________________________
| Mathieu Bouchard ---- tél: +1.514.383.3801 ---- Villeray, Montréal, QC