Hi Sasivimon,
I apologise for the delay in responding to your email. I have been
*very* busy with far too many incoming emails and other activities.
On Jan 13, 2009, at 10:32 AM, sasivimon wrote:
Hello,
I'm trying to experiment with human voice pronunciation using the TRM
and a spreadsheet application.
But I can't figure out the relation between variations in the
interpolation function for the vocal tract shape
and the resulting variation in the formants (F1 and F2). (I think
Monet uses Bezier curves to calculate the interpolation of the vocal
tract shape.) For example, if I changed the interpolation function
from a Bezier curve to a sine curve, how would the F1 and F2 values
change?
My questions are:
1. How does the interpolation function of the vocal tract shape
affect the formants?
The TRM is simply a waveguide model of the vocal tube and contains no
information about how to produce speech in any language. In fact,
the TRM could equally well simulate a trumpet.
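To make "a waveguide model of the vocal tube" concrete, here is a minimal Kelly-Lochbaum-style sketch in Python. This is emphatically not the TRM code: the section count, losses, nasal branch, and radiation model are all simplified away, and the area values and end-reflection coefficients below are made up purely for illustration.

```python
# Minimal Kelly-Lochbaum-style waveguide sketch (illustrative only;
# the actual TRM implementation differs in many details).

def reflection_coeffs(areas):
    """Pressure-wave reflection coefficient at each junction between
    adjacent tube sections (larger area = lower acoustic impedance)."""
    return [(areas[i] - areas[i + 1]) / (areas[i] + areas[i + 1])
            for i in range(len(areas) - 1)]

def simulate(areas, excitation, glottal_reflect=0.7, lip_reflect=-0.9):
    """Propagate forward/backward travelling waves through the tube,
    scattering at each junction; returns the pressure radiated at the
    lips, one sample per excitation sample."""
    ks = reflection_coeffs(areas)
    n = len(areas)
    fwd = [0.0] * n  # right-going wave in each section
    bwd = [0.0] * n  # left-going wave in each section
    output = []
    for x in excitation:
        nf = [0.0] * n
        nb = [0.0] * n
        # glottis end: inject excitation plus a partial reflection
        nf[0] = x + glottal_reflect * bwd[0]
        # scattering at each interior junction
        for i, r in enumerate(ks):
            nf[i + 1] = (1 + r) * fwd[i] - r * bwd[i + 1]
            nb[i] = r * fwd[i] + (1 - r) * bwd[i + 1]
        # lip end: partial reflection back in, the rest radiates out
        nb[n - 1] = lip_reflect * fwd[n - 1]
        output.append((1 + lip_reflect) * fwd[n - 1])
        fwd, bwd = nf, nb
    return output
```

Changing the area values changes where the reflections occur, and hence the resonances (formants) of the tube; exactly the same machinery would model a trumpet bore if fed brass-like areas and excitation.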
2. Why does Monet use Bezier curves to interpolate between two vocal
tract shapes?
It doesn't really use "Bezier" curves.
Monet is a multipurpose editor that allows the TRM values for speech
postures to be stored, and the shape of the parameter tracks between
postures to be defined according to a collection of rules, with yet
more rules selected according to the particular combinations of speech
postures defined by the input. These further rules decide which
parameter track rules to use, and what timing to apply to the quasi-
steady-state and transitional regions of each dynamic change from
posture to posture. When Monet produces its output speech, it further
applies rhythm and intonation models to vary the prosody of the speech.
3. And how can we improve the interpolation function compared to a
real human?
By using Monet to define/refine suitable trajectories between
successive postures.
4. According to your website
(http://pages.cpsc.ucalgary.ca/~hill/papers/synthesizer/body.html),
I would like to know the principles by which you defined the values
of the parameters in the parameter table in Appendix A.
Ideally, we would have X-Ray data that would allow us to define the
area functions for the TRM that would be stored as part of the Monet
database. In practice, the interactive program "Synthesizer" is used
to determine which steady state values of the TRM region parameters
will produce the sounds that are needed. In the case of postures
that do not produce much sound during closure (stops, fricatives),
the concept of "locus" as determined by the Haskins Laboratories in
the 50s and 60s is used. The locus is the "origin" of the spectral
transitions, and represents the posture that would produce the
"virtual sound" that represents the "locus" of the stop sound.
I read the pronunciation guide (Manzara & Hill 2002),
but I have no clue how to translate that notation into numerical
parameters for the TRM, especially for a non-English language.
You need to get Monet up and running on a Macintosh or under GNUstep
on a Linux machine -- the sources are in the Savannah repository,
accessible using SVN (ignore the CVS repository):
http://savannah.gnu.org/projects/gnuspeech
and you will have a much better idea of what is involved. There is
also a web page, accessible from that page under the heading "Project
Home Page", that provides a short descriptive overview of the whole
Gnuspeech system. It can be accessed directly at:
http://www.gnu.org/software/gnuspeech/
Hope this helps.
Sorry for my bad English.
Your English is fine, thank you.
All good wishes.
david
--------
David Hill
[email protected]
http://savannah.gnu.org/projects/gnuspeech
--------
Simplicity, patience, compassion. These three are your greatest
treasures (Tao Te Ching #67)
---------
_______________________________________________
gnuspeech-contact mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnuspeech-contact