On Wed, 16 May 2018 05:23:08 -0800 James Kass via Unicode <unicode@unicode.org> wrote:
> Note that although the proposal gave canonical combining class
> zero to both the tone marks and the vowel signs, the on-line Unicode
> data gives canonical combining class 230 to the tone marks.

There were several changes from ccc=0 to non-zero that were sneaked in
between the UTC agreeing to proceed with the proposal and Unicode 5.2
being published. That may have been a test of vigilance; we failed. I
have seen no benefit from the changes - U+1A60 TAI THAM SIGN SAKOT is
not a virama (it should not appear in valid text), and having the tone
marks and the invisible stacker have distinct non-zero classes has
caused lots of irritation. We should probably have risked Tai Tham
being excluded from the BMP and gone for the Tibetan model;
normalisation would not then damage Tai Tham text.

> > The placement may be different to that of MAI KANG
> > in /bɔː waː/ ᨷᩴ᩠᩵ᩅᩣ <BA, MAI KANG, TONE-1, SAKOT, WA,
> > SIGN AA> or ᨷᩴ᩠ᩅ᩵ᩣ <BA, MAI KANG, SAKOT, WA, TONE-1,
> > SIGN AA> - I don't know whether the first or the second
> > tone mark is dropped.

> FWIW, neither is dropped in the display here, although they don't
> display identically. The first string shows TONE-1 positioned to the
> right of MAI KANG, the second string superimposes them. (Windows 7
> running LibreOffice in order to enable the USE from HarfBuzz.)

The full uncontracted writing is <BA, MAI KANG, TONE-1, WA, TONE-1,
SIGN AA>. Both syllables have TONE-1, but I have not seen two identical
tone marks from different phonetic syllables in the same stack. The
person typing the contraction drops a tone mark, not the rendering
system.

> Substituting U+1A36 TAI THAM LETTER NA for BA in the above strings,
> ᨶᩴ᩠᩵ᩅᩣ ᨶᩴ᩠ᩅ᩵ᩣ, and trying to get the ligature are in the attached
> *.PNG file.
> Here's the four strings for the PNG:
>
> \u1A36\u1A74\u1A75\u1A60\u1A45\u1A63
> \u1A36\u1A74\u1A60\u1A45\u1A75\u1A63
> \u1A36\u1A75\u1A63\u1A74
> \u1A36\u1A63\u1A74\u1A75

A lot of fonts have trouble ligating NA and AA when there is material
between them. (Hint: Classify all non-spacing subscript consonants as
marks, and spacing subscript consonants as bases, and set the ligating
lookup to ignore marks.)

Your example appears to be using the font called 'A Tai Tham KH New'.
While the only way to type Pali _bho_ 'O' after other text in this font
or 'A Tai Tham KH' is to enter the correct sequence <LOW PHA, SIGN E,
SIGN AA>, the former font cannot render Pali _mano_ 'mind' (also used
in Northern Thai and probably also Tai Khuen) if one types the correct
sequence <MA, NA, SIGN E, SIGN AA>. One has to type <MA, NA, SIGN AA,
SIGN E>! The *older* font 'A Tai Tham KH' (at Version 2.0) does render
the correct spelling properly.

As an example of correct rendering, I include the Pali for 'O mind!',
_bho mano_, encoded <LOW PHA, SIGN E, SIGN AA, MA, NA, SIGN E,
SIGN AA>, as rendered by the Lamphun font.

Richard.
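
For anyone wanting to check the combining classes and the reordering
discussed above, here is a minimal sketch using Python's standard
unicodedata module. It takes the first of the four quoted strings,
lists each character's canonical combining class, and compares the text
with its NFC form. The helper name describe and the variable names are
illustrative only, and the values reported depend on the Unicode data
bundled with the interpreter. With TONE-1 at ccc=230 and SAKOT at
ccc=9, canonical ordering should move SAKOT ahead of TONE-1.

```python
import unicodedata

# First string from the message: <NA, MAI KANG, TONE-1, SAKOT, WA, SIGN AA>
typed = "\u1A36\u1A74\u1A75\u1A60\u1A45\u1A63"


def describe(text):
    """Return (code point, combining class, name) for each character."""
    return [
        (f"U+{ord(ch):04X}",
         unicodedata.combining(ch),
         unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Canonical ordering sorts each run of non-zero-ccc marks by class.
normalised = unicodedata.normalize("NFC", typed)

for label, text in (("typed", typed), ("NFC", normalised)):
    print(label)
    for code, ccc, name in describe(text):
        print(f"  {code}  ccc={ccc:<3}  {name}")

print("changed by normalisation:", typed != normalised)
```

The second quoted string, in which WA (ccc=0) separates SAKOT from
TONE-1, has no adjacent marks to reorder and can be passed through the
same check for comparison.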