Timothy: I cannot call myself an expert in Mongolian either, although I have worked with it to a limited extent, and have some access to people who have worked with it more. The person you mentioned, Oliver Corff, is also behind the gif I mentioned the other day, and I have used that in the past. (See http://userpage.fu-berlin.de/~corff/im/MLS/overview.MLS.html) Another program I have used is Xenotype. (http://www.xenotypetech.com/) However, since the script is mainly in actual use in Inner Mongolia, and since I cannot currently find details (though I will try) on what I know is the most used program there, I would not feel comfortable writing something up based upon Western programs only. (For the Chinese program, see http://www.founder.com.cn/ics/gb/content/2001-10/31/content_310.htm)
I have been trying to collect enough information. In one sense, information about *regular* formation is common enough, as is information about the somewhat grey area of "predictable irregular" behaviour. Exhaustive information on what happens in foreign or archaic words (or sometimes, just to distinguish between homophones) etc. is much more difficult to come by; the Japanese entry for Mongolian in the writing volume of the Sekai gengo daijiten does have some pertinent remarks and examples, but looking through dictionaries one can find others.

One problem I have already encountered: in Unicode, Manchu (and Sibe etc.) is considered part of Mongolian. That is, most "Mongolian letters" defined as such are used for Manchu as well; those called "Manchu" are simply the small subset not used for Mongolian at all. For prescribed behaviour, and the use of variants, should one take Manchu and Mongolian as a whole? That is, to give a real example, if there is a final "n" and a final "N", and the latter takes two different glyph variants in Manchu and Mongolian, is one variant selector sufficient (with meaning: in Mongolian G', in Manchu G"), or are two necessary?

While the first would seem sufficient, the fact that there are THREE variant selectors, together with the letters they are listed with in the table, makes me wonder whether it is not the latter which is meant by Unicode. In practice, when these little-used scripts are used, they are likely to be used by the same people in the same contexts and same documents, and perhaps even the same fonts, so the latter makes some practical sense, even if it is not strictly necessary in theory.
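To make the question concrete, here is a minimal Python sketch of the first reading, in which a single variant selector means G' in Mongolian and G" in Manchu. Everything in it beyond the code points themselves is an assumption for illustration only: the glyph names, the table contents, the language tags, and the choice of U+1829 (MONGOLIAN LETTER ANG) as a stand-in for the final "N". It is not taken from the standard or from any font.

```python
import unicodedata

# The three Mongolian free variation selectors defined in Unicode.
FVS = ["\u180B", "\u180C", "\u180D"]

# Hypothetical stand-in for the final "N" discussed above.
BASE = "\u1829"  # MONGOLIAN LETTER ANG

# Hypothetical glyph table: (base, selector or None, language) -> glyph name.
# Under the "one selector is sufficient" reading, the same selector maps to
# a different glyph depending on the language of the text.
GLYPHS = {
    (BASE, None, "mongolian"): "ang.final.default",
    (BASE, None, "manchu"): "ang.final.default",
    (BASE, FVS[0], "mongolian"): "ang.final.G1",  # placeholder for G'
    (BASE, FVS[0], "manchu"): "ang.final.G2",     # placeholder for G"
}


def select_glyph(base, selector, language):
    """Return the glyph for base + selector, falling back to the default form."""
    default = GLYPHS[(base, None, language)]
    return GLYPHS.get((base, selector, language), default)


if __name__ == "__main__":
    for sel in FVS:
        print(f"U+{ord(sel):04X}", unicodedata.name(sel))
    print(select_glyph(BASE, FVS[0], "mongolian"))
    print(select_glyph(BASE, FVS[0], "manchu"))
```

The weak point of this reading is visible in the lookup key: plain text carries no language tag, so the renderer has to get "mongolian" or "manchu" from somewhere outside the character sequence. Under the second reading the table would be keyed on (base, selector) alone, at the cost of using up two selectors.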
Martin Heijdra

----- Original Message -----
From: "Timothy Partridge" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, July 15, 2002 1:53 PM
Subject: Re: Variant selectors in Mongolian

> You recently said:
>
> > > I believe Unicode
> > > should take an explicit position on this as it has important implications
> > > for successful rendering of plain text on various platforms.
> >
> > I think Tim Partridge and Martin Heijdra and anyone else actually
> > working on Mongolian with some implementation experience should
> > write up a technical note on this and present it to the UTC
> > so that it *could* take an explicit position based on input from
> > experts. Right now there *are* no Mongolian script experts involved
> > in the UTC or the editorial committee, and that is one of the fundamental
> > reasons why the text seen in the standard isn't very clear yet.
> > It is going to take *somebody* who understands the Unicode text
> > model *and* Mongolian to write up such text.
>
> I am most definitely not a Mongolian expert, I simply have an interest in
> writing systems in general. I am familiar with the Unicode text model and I
> am willing to work with script experts to develop something.
>
> I would suggest we could produce the following:
>
> A description of the normal behaviour of the Mongolian writing systems, with
> examples of Unicode character sequences and their visual appearance. This
> would be at a similar level to the other script descriptions concentrating
> on rendering information rather than linguistic details.
>
> Some examples of unusual behaviours and how to obtain those results using
> appropriate character sequences.
>
> A machine readable cross reference between the characters and their various
> glyphs (presentation forms). This would include the valid combinations of
> variation selectors. Many characters share the same glyph and I think it
> would help implementers to be certain that two glyphs were indeed the same
> rather than having a subtle difference.
>
> A proposed algorithm for converting character sequences to glyphs. Ideally
> this would
>    Be simple to implement
>    Cover all the normal behaviour correctly without use of variant selectors
>    Behave in a way that is readily predictable by someone familiar with the script
>
> A simple prototype implementation of the algorithm would be useful for
> checking it behaved as expected.
>
> Martin, are you interested in working on such a project? It would be helpful
> if one or more of the authors of UNU/IIST report 170 (Myatav Erdenechimeg,
> Richard Moore and Yumbayar Namsrai) were available for consultation. A brief
> web search reveals that Oliver Corff has been working on rendering Mongolian
> using TeX and he has tried some experimental Unicode support. Is anyone
> familiar with his work?
>
> Regards,
>
> Tim
>
> --
> Tim Partridge. Any opinions expressed are mine only and not those of my employer