Hank Tt <[EMAIL PROTECTED]> writes:
>Hi,
>
>I'm trying to make a UCM file to feed to enc2xs. The legacy encoding for
>Taiwanese romanization *must* have its code points mapped to Unicode
>character sequences, for the simple reason that the UCS lacks the
>corresponding precomposed characters (and is unlikely to have them in the
>future, as they are composable using existing characters from the Latin
>script and the Diacritical Combining Marks blocks). (See [1] for script
>details.)

Are the Unicode character sequences in [1] normalized? Can you explain what the diacritics mean? I assume ' ` ^ etc. are tone marks. What do the macron, the dot, and the dots-below signify?
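
To make the normalization question concrete, here is a minimal sketch of how one could test a sequence. It is in Python only for illustration (enc2xs itself is a Perl tool), and the helper names and sample strings are made up for this example; they are not taken from [1].

```python
import unicodedata

def is_normalized(form, text):
    """True if `text` is unchanged by normalizing it to `form` (e.g. "NFC", "NFD")."""
    return unicodedata.normalize(form, text) == text

def canonically_equivalent(a, b):
    """True if two sequences are the same text once canonical ordering is applied."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Hypothetical sample sequences, chosen only to illustrate the checks.
samples = {
    "e + combining acute": "e\u0301",
    "precomposed e-acute": "\u00e9",
    "a + acute + dot below": "a\u0301\u0323",
    "a + dot below + acute": "a\u0323\u0301",
}

if __name__ == "__main__":
    for label, seq in samples.items():
        codepoints = " ".join(f"U+{ord(ch):04X}" for ch in seq)
        print(f"{label}: {codepoints} "
              f"NFC={is_normalized('NFC', seq)} NFD={is_normalized('NFD', seq)}")
```

If the right-hand sides of the UCM mappings are not in a single normalization form, two mappings can be canonically equivalent while differing byte for byte, which matters when the decoded text is later compared or round-tripped.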