On Thu, 9 May 2019 11:55:23 -0400
Ed Trager via Unicode <unicode@unicode.org> wrote:

> ** A good use case is the Tai Tham word U+1A27 U+1A6A U+1A60 U+1A37 ,
> transcribed to Central Thai script as จูบ, (*to kiss*). Currently,
> people are writing this as U+1A27 U+1A60 U+1A37 U+1A6A ("จบู") which
> violates the "phonetic ordering" but is the current workaround
> because USE is still broken for TAI THAM.
>
> REFERENCE DOCUMENT:
> http://www.unicode.org/L2/L2018/18332-tai-tham-ad-hoc-report.pdf

How is this a good test case? The 6th preliminary recommendation reads,
"To represent a cluster, regardless of the phonetic order CCV or CVC, a
consonant sign should always be encoded before the vowel sign, unless
the vowel sign has inline advance and is apparently followed by the
consonant sign". If this recommendation is adopted, then the spelling
"U+1A27 U+1A6A U+1A60 U+1A37" will be wrong.

Now, SIGN U and SIGN UU before subscript BA, HIGH PA and LOW YA aren't
always written as though they followed the subscript consonants in
phonetic order. Sometimes the vowel sign is written in the bottom left
of the syllable. Presumably we'll need 3 or 4 new signs:

TAI THAM UNAMBIGUOUS UB
TAI THAM UNAMBIGUOUS UUB
TAI THAM UNAMBIGUOUS UY
TAI THAM UNAMBIGUOUS UUY (?)

I'm not sure that the fourth one can occur.

An example of the contrast is shown in the attached files luynam.png,
with the first orthographic syllable <LA, SIGN U, SAKOT, LOW YA>, and
yukya.png, with the first orthographic syllable <HIGH HA, SAKOT, LOW YA,
SIGN U>.

I wonder how we'd be supposed to encode ᩉᩖᩩ᩠᩶ᨿ (currently <HIGH HA,
MEDIAL LA, SIGN U, TONE-2, SAKOT, LOW YA>) 'to crawl'. The simplest way
would be to encode it as <HIGH HA, MEDIAL LA, SAKOT, LOW YA, SIGN U,
TONE-2>, which currently encodes the unlikely ᩉᩖ᩠ᨿᩩ᩶. Will good fonts be
expected to move the vowel left and down from the subscript LOW YA to
the MEDIAL LA? Or will we need to encode it with *TAI THAM UNAMBIGUOUS
UY?

Richard.
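
For readers who want to compare the competing spellings code point by code point, the following is a minimal sketch in Python using only the standard `unicodedata` module. The first pair of sequences is taken directly from the quoted message. The second pair is an assumption: it maps the character names given above for 'to crawl' onto Tai Tham code points, and should be checked against the code charts. The helper names `describe` and `same_characters` are illustrative only.

```python
import unicodedata

# Spellings of the word transcribed as "to kiss", as quoted in the thread.
PHONETIC_ORDER = [0x1A27, 0x1A6A, 0x1A60, 0x1A37]    # vowel sign before SAKOT + BA
WORKAROUND_ORDER = [0x1A27, 0x1A60, 0x1A37, 0x1A6A]  # SAKOT + BA before vowel sign

# Spellings of "to crawl". ASSUMPTION: code points derived from the
# character names in the message (HIGH HA, MEDIAL LA, SIGN U, TONE-2,
# SAKOT, LOW YA); verify against the Tai Tham code chart before relying on them.
CRAWL_CURRENT = [0x1A49, 0x1A56, 0x1A69, 0x1A76, 0x1A60, 0x1A3F]
CRAWL_CONSONANT_FIRST = [0x1A49, 0x1A56, 0x1A60, 0x1A3F, 0x1A69, 0x1A76]


def describe(code_points):
    """Return one 'U+XXXX NAME' string per code point."""
    return [
        f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"
        for cp in code_points
    ]


def same_characters(a, b):
    """True if two sequences use the same characters, ignoring order."""
    return sorted(a) == sorted(b)


if __name__ == "__main__":
    pairs = [
        ("to kiss", PHONETIC_ORDER, WORKAROUND_ORDER),
        ("to crawl", CRAWL_CURRENT, CRAWL_CONSONANT_FIRST),
    ]
    for gloss, first, second in pairs:
        print(gloss)
        for seq in (first, second):
            for line in describe(seq):
                print("   ", line)
            print()
        print("    same characters, different order:",
              same_characters(first, second) and first != second)
```

In each pair the two spellings contain exactly the same characters and differ only in where the vowel sign (and any tone mark) sits relative to SAKOT plus the subscript consonant, which is the ordering question the recommendation is trying to settle.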