> On 20Jul2016, at 10:00, [email protected] wrote:
>
> From: Paul Tyson <[email protected]>
>
> Can anyone comment on the suitability, benefits, challenges, etc. of
> equipping gnuspeech with a parser for International Phonetic Alphabet
> input?
>
> Thanks and regards,
> --Paul

The IPA is well documented and already familiar to trained linguists. I, for one, would welcome such a parser for IPA input to gnuspeech.

Editing IPA:

Typing IPA can be a challenge for people who don’t have much experience typing Unicode. I personally use gvim, a great editor for this purpose, but it has a steep learning curve at the beginning. Vim/Gvim is, however, well documented, and there’s a very helpful user community.

To use Gvim, you need to supply a ~/.vimrc file (read first) and a ~/.gvimrc file (read second). The user community can supply vanilla examples to start with. I specify (in the .gvimrc file)

    set encoding=utf-8

(that’s the internal gvim buffer encoding) and

    set fileencodings=ucs-bom,utf-8,cp1252,iso-8859-1

(these are the encodings that are tried, in left-to-right order, when opening a file. ucs-bom means that it looks for a byte-order mark on the file, and if it finds one, the file gets converted into a utf-8 buffer accordingly; if there’s no BOM, it will try to interpret the file as utf-8; if that fails, it will try cp1252, etc.)

Gvim needs a mono-width font. I use the freely available DejaVuSansMono font, which has IPA glyphs, specifying (again in the .gvimrc file)

    set anti guifont=DejaVu\ Sans\ Mono:h14

That incantation works for OS X. For Unix (including Linux) use

    set anti guifont=DejaVu\ Sans\ Mono\ 12

For win32 use

    set anti guifont=DejaVu_Sans_Mono:h12

Gvim also allows you to specify “keymap” files that can facilitate typing IPA, Greek, Russian, or whatever. I can’t go into that here, but it’s a powerful feature that allows you to enter “exotic” characters in the way that is most natural to you.

There are also, of course, other editors that can be used to type Unicode. The days of having to use ASCII transliterations are long over.

Challenges:

A string of IPA is just a sequence of Unicode characters, so there’s no special challenge there.

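To make the "sequence of Unicode characters" point concrete, here is a minimal Python sketch of how a parser's first step might look. It is only an illustration, not anything taken from gnuspeech: the function names segment_ipa and describe and the sample string are hypothetical, and the grouping rule (attach every Unicode combining mark to the preceding base character) is an assumption about what a parser would want.

```python
import unicodedata


def segment_ipa(text):
    """Group each base character with the combining marks that follow it."""
    # NFD splits precomposed letters (e.g. U+00E3) into base + combining mark,
    # so every diacritic shows up as its own code point.
    decomposed = unicodedata.normalize("NFD", text)
    segments = []
    for ch in decomposed:
        # General categories Mn, Mc and Me are the Unicode combining marks.
        if unicodedata.category(ch).startswith("M") and segments:
            segments[-1] += ch
        else:
            segments.append(ch)
    return segments


def describe(segment):
    """Return (code point, Unicode name) pairs for one segment."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in segment]


if __name__ == "__main__":
    # Hypothetical input: a + combining tilde, n + combining ring below,
    # and the turned r (U+0279).
    sample = "a\u0303n\u0325\u0279"
    for seg in segment_ipa(sample):
        print(seg, describe(seg))
```

Note that this sketch only handles true combining marks. IPA modifier letters such as the aspiration mark (U+02B0) or the length mark (U+02D0) are ordinary spacing characters in Unicode, so they would come out as separate segments here and a real parser would need its own rule for them.
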
The IPA is very rich, including a fair number of diacritic marks that specify details of pronunciation. (In Unicode, the diacritic marks are technically separate characters called “combining diacritics.”) Any parser would probably begin by implementing only a subset of IPA, and that subset would need to be well documented. There would probably be requests for corrections and augmentations for years to come.

It’s very common to precede any use of IPA with a prose set of “conventions.” For example, the ‘r’ letter in IPA technically represents an alveolar trilled r as in the Spanish “carro.” The typical American English r is technically represented with an upside-down lowercase r (ɹ), but the conventions might state that simple ‘r’ will be used instead of the technically correct IPA letter, especially when the IPA is being used for ‘broad’/‘phonemic’ transcription.

Probably the user should simply be made responsible for providing the technically correct IPA letters and diacritics to the parser, at least at the beginning. Perhaps the parser for gnuspeech could _eventually_ include some user-supplied mapping table to map from the simplified/conventional/phonemic IPA text provided by the user to a more ‘narrow’/‘correct’/‘phonetic’ IPA, but that would open a can of worms. I’d _definitely_ start with requiring the user to supply detailed ‘narrow’/‘correct’/‘phonetic’ IPA, and let him/her do any required mapping from ‘phonemic’ to ‘phonetic’ IPA before the text gets to the gnuspeech IPA parser.

Best,

Ken

********************************
Kenneth R. Beesley, D.Phil.
PO Box 540475
North Salt Lake UT 84054
USA

_______________________________________________
gnuspeech-contact mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gnuspeech-contact
