On Sun, 13 May 2018 22:15:10 -0800 James Kass via Unicode <[email protected]> wrote:
> Richard Wordingham asked,
>
> > Is this a reasonable approach to allowing both collation
> > and suppressing needless homographs?  My contribution to
> > the rendering is only the provision of a font.
>
> If anything about this approach was unreasonable, one of the experts
> on this list would probably have pointed it out by now.

Not necessarily; some may still be recovering from the recent UTC
meeting.  Moreover, it took many years before we were told that there
was no character to suppress word boundaries wrongly deduced by Thai
breaking algorithms.  The character we had been using, U+2060 WORD
JOINER, is apparently only for suppressing line breaks.

> Riding along with the insertion of the dotted circles by the USE
> enables the actual users to see immediately that the text needs to be
> modified in order to render reasonably on that system with the shaping
> engine and font selected.  If users consider any such insertion
> inappropriate, then it's feedback time.

The massive failure of USE was reported within hours of USE being
announced on the Unicode forum.  So far there has only been tinkering,
and an encouragement of bad spelling.  For example, at least about 23%
of Northern Thai monosyllables can be rendered only by clear
misspelling - see the results in
http://www.wrdingham.co.uk/lanna/random_test.htm.  The USE
specification brushes over this with the statement, "Note: Tai Tham
support is limited to mono-syllabic clusters", which gives the
misleading impression that mono-syllabic clusters are supported.
Basically, support is limited to (C)+(V)* clusters with a liberal
interpretation of C and V.  Crw and Cry aren't supported either.

At the moment, one is generally better off using a Thai hack font that
uses paiyannoi to toggle between the various forms and placements of
Tai Tham characters.  That has the advantage that the text is still
intelligible when you have no font that renders it as Tai Tham.  The
main limitation of such schemes is in plain text.

> > ... and it is frequently desirable for a font to be able
> > to display its own name.
>
> Does the font name have to be in a Latin-based script?

Postscript certainly gets unhappy if there isn't an ASCII name for it;
I don't know the requirements for the various PDF generators.

Richard.

