On 26-Dec-04 Uwe Ligges wrote:
> (Ted Harding) wrote:
>> So I would like to ask R people for their recommendations
>> for a program which would
>>
>> a) Take as input a sound file in one of the common formats
>>    (".wav", ".au")
>
> Ted,
>
> see package tuneR for reading Wave files.
>
>
>> b) perform at least basic phonetic analysis (formants, F0,
>>    spectrograms, ... )
>
> For F0 and spectrograms see also tuneR.

Thanks for the pointer to tuneR, Uwe. I've had a look at the
reference manual, and it does seem to be primarily oriented
towards analysis of musical data. I'm not so much interested
in getting the raw sound file into R and then doing basic
frequency-type analysis on it, as in working on the output
of a program which can apply phonetic expertise to the file
and then present the characteristics of the phonetic analysis
to R for further analysis.

> "Formants" is a bit more tricky. We tried some analyses, but
> since the definition of a formant is still not completely clear
> to me, we haven't provided anything for formants in the package
> yet.
>
> Do you know some good literature that gives a somewhat precise
> definition? At least musicians only talk about something like
> "raised" areas in the periodogram, which is not very helpful
> given the missing definition of "raised".

Well, I'm only a beginner! I could agree with your summary
from what I have read so far. The account I have seen so far
which best combines general accessibility with apparent
technical thoroughness is the on-line Britannica article
"Phonetics":

http://www.britannica.com/eb/print?tocId=9108587&fullArticle=true

and the following is a relevant quote:

  In summary, speech sounds are fairly well defined by nine
  acoustic factors. The first three factors include the
  frequencies of the first three formants; these are responsible
  for the major part of the information in speech. Characterizing
  the vocal tract shape, these formant frequencies specify vowels,
  nasals, laterals, and the transitional movements in voiced
  consonants. The frequencies of the fourth and higher formants
  do not vary significantly. The fourth factor is the fundamental
  frequency--roughly speaking, the pitch--of the larynx pulse in
  voiced sounds, and the fifth, the amplitude--roughly speaking,
  the loudness--of the larynx pulse. These last two factors
  account for suprasegmental information; e.g., variations in
  stress and intonation. They also distinguish between voiced and
  voiceless sounds, in that the latter have no larynx pulse
  amplitude. The centre frequency of the high-frequency hissing
  noises in voiceless sounds constitutes the sixth acoustic
  factor, and the seventh is the amplitude of these high-frequency
  noises. These two factors characterize the major differences
  among voiceless sounds. In more accurate descriptions it would
  be necessary to specify more than just the centre frequency of
  the noise in fricative sounds. The eighth and ninth factors
  include the amplitudes of the second and third formants relative
  to the first formant; the amplitudes of the formants as a whole
  are determined by the larynx pulse amplitude. These latter
  factors are the least important in that they convey only
  supplementary information about nasals and laterals.

Earlier in the article it is stated that

  "The resonant frequencies of the vocal tract are known
   as the formants."

but one has to read through the whole thing before the richer
implications of this start to become apparent.

The advantage of software like 'praat' is that phonetic experts
have incorporated their understanding -- much clearer than I'm
likely to achieve from the above -- into the software!

I'm also grateful to Shravan Vasishth for responding with the
suggestion of EMU. This seems at first sight to be less
sophisticated than 'praat', though with what looks like a useful
repertoire of "primitives" -- from its description:

  "EMU is a collection of software tools for the creation,
   manipulation and analysis of speech databases. At the
   core of EMU is a database search engine which allows
   queries based on the sequential and hierarchical
   structure of the annotations."

It has the immediate advantage that it comes with facilities
for direct linkage to S-Plus and R. Clearly worth looking into,
but I don't know yet whether it would do enough of the dirty
work for me!

Thanks, Uwe and Shravan! If I get anywhere useful, I'll report
back to the list.
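
A footnote on the formant question, for what it is worth: the
"resonant frequencies" definition quoted above can be tried out
on a synthetic signal. What follows is only a minimal sketch under
stated assumptions, and not what praat, EMU or tuneR actually do.
White noise is passed through two resonators, an all-pole
(linear-prediction) model is fitted with the Levinson-Durbin
recursion, and the peaks of the fitted envelope are read off as
formant estimates. The centre frequencies (700 Hz and 1800 Hz),
the bandwidths and all function names are made up for
illustration. It is in plain Python only to keep it free of
package dependencies.

```python
import cmath
import math
import random


def resonator_coeffs(freq_hz, bandwidth_hz, fs):
    """Feedback coefficients (a1, a2) of a two-pole resonator."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs         # pole angle from centre frequency
    return -2.0 * r * math.cos(theta), r * r


def synthesize(formants, fs, n, seed=0):
    """Seeded white noise passed through a cascade of resonators.

    `formants` is a list of (centre_hz, bandwidth_hz) pairs; the values
    used below are illustrative assumptions, not measured speech data.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for freq_hz, bandwidth_hz in formants:
        a1, a2 = resonator_coeffs(freq_hz, bandwidth_hz, fs)
        y1 = y2 = 0.0
        out = []
        for s in x:
            y = s - a1 * y1 - a2 * y2            # y[n] = x[n] - a1*y[n-1] - a2*y[n-2]
            out.append(y)
            y2, y1 = y1, y
        x = out
    return x


def lpc(signal, order):
    """All-pole model A(z) = 1 + a[1]z^-1 + ... via Levinson-Durbin."""
    n = len(signal)
    r = [sum(signal[i] * signal[i + k] for i in range(n - k))
         for k in range(order + 1)]              # autocorrelation, lags 0..order
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                           # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def envelope_peaks(a, fs, step_hz=10.0):
    """Frequencies (Hz) of interior local maxima of |1/A| on a grid."""
    n_points = int((fs / 2.0) / step_hz) + 1
    freqs = [i * step_hz for i in range(n_points)]
    mags = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        mags.append(1.0 / abs(denom))
    return [freqs[i] for i in range(1, n_points - 1)
            if mags[i - 1] < mags[i] >= mags[i + 1]]


if __name__ == "__main__":
    fs = 8000
    signal = synthesize([(700.0, 80.0), (1800.0, 100.0)], fs=fs, n=8000, seed=1)
    print(envelope_peaks(lpc(signal, order=4), fs=fs))
```

Picking peaks on a smoothed envelope like this is one possible
way of giving "raised" a definite meaning. Real speech would
presumably need framing, pre-emphasis and a higher model order,
which is exactly the dirty work I would rather leave to the
phonetics software.
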

All best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 26-Dec-04                                       Time: 14:02:13
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html