On Tue, 15 May 2018 04:19:42 -0800 James Kass via Unicode <unicode@unicode.org> wrote:
> On Mon, May 14, 2018 at 11:31 AM, Richard Wordingham via Unicode
> <unicode@unicode.org> wrote:
>
> > ... One could argue that the three positions require
> > different glyphs for SIGN U. Each font would need its own PUA.
>
> Or a consensus.

One would end up with a large glyph list to accommodate all designs.
Imagine applying this approach to Devanagari, with all its Sanskrit
conjuncts to be supported although some converters would only target a
small subset.

> > ... There are several
> > places in Tai Tham layout where I want to swap glyphs round, but for
> > the layout engine to do so for me would cause grief for other Tai
> > Tham fonts. This rearrangement cannot be delegated to the rendering
> > engine. There are Tai Tham fonts which handle Indic rearrangement
> > in the ccmp feature, but they are then totally defeated by either
> > ccmp not being enabled or by the USE doing basic Indic shaping.
>
> Suppose the OpenType specs were revised to include a bit which could
> be set for disabling basic Indic shaping by the USE? I wouldn't set
> it if I were just starting out to make a font for a complex script
> requiring basic Indic shaping, and cannot imagine why anyone else just
> starting out would.

One would need to set the bit while the script was not yet in Unicode,
and then you may well need to set it when the USE bites.

As another concrete example, one couldn't use USE for the Khmer script -
it too has CVC syllables. I believe there are also lurking problems
with the ordering of the rarer marks.

You'd come unstuck if you found your script had both preposed
subscripts and optionally preposed matras. The USE can't handle both in
the same syllable. One might need to ignore syllable boundaries before
Indic re-ordering, though that's probably a preference rather than a
requirement.

Tai Tham has a troublesome mark, U+1A58 TAI THAM SIGN MAI KANG LAI. In
the West, it's 'Consonant final' and is a mark above or above right.
In the East, it works like Burmese kinzi, and acts like a repha.
Revision 1 of the Maefahluang Dictionary of Northern Thai sits on the
border. In its text, it behaves one way in some environments, and the
other way in others.

Finally, many scripts had fonts before Windows supported them. Indeed,
isn't significant Tai Tham renderer support on Windows 7 restricted to
HarfBuzz clients? (I don't believe M17n is significant, and I fear my
interfacing set-up only works for my fonts.)

> >> A good keyboard driver ...
> >
> > It won't work. The text input delivered by X still needs to be
> > supported, and without modifying the application, X can only input
> > one character at a time. Not everyone uses an 'input method'.
>
> Every keyboard uses a driver, though. I can't speak for "X", but my
> understanding is that the keyboard driver acts as sort of a buffer
> between the user's key strokes and the application.

X attempts to present the key strokes to the application. The
application may choose to present these key strokes to an input method
to handle, but these input methods are not reliable. I have a battery
of three input methods for most applications on Ubuntu - raw X keyboard
mapping, ibus using Keyman for Linux, and fcitx using M17n.
Additionally, I find Emacs is easier to use if I talk to it in ASCII and
use its input methods for other character sets. The advantage there is
that Emacs knows whether I am entering a command, which must be in
ASCII, or text, for which it uses the active input method.

Another issue is that normalised text can be highly inconvenient for a
font. HarfBuzz chooses a non-standard normalisation for several scripts
simply because that makes things easier for a font.

> > I've seen an implementation of the USE render
> > canonically equivalent strings differently. ...
>
> Because the USE failed or because the font provided look-ups for each
> of those strings to different glyphs?

Remember that the USE changes the string presented to the font by
inserting dotted circles. Essentially, <tone, SAKOT, consonant> and
<SAKOT, tone, consonant> can be penalised differently - Microsoft
inserts more dotted circles than does HarfBuzz.

Richard.
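
Illustration of the canonical-equivalence point: the two mark orders mentioned above, <tone, SAKOT, consonant> and <SAKOT, tone, consonant>, differ as code point sequences but normalise to the same string, because SAKOT and the tone marks have different non-zero canonical combining classes. The sketch below uses Python's standard unicodedata module. The particular letters (HIGH KA, TONE-1) and the helper name canonically_equivalent are assumptions chosen for illustration only; the sequence is not claimed to spell a real word, and nothing here models what the USE or HarfBuzz actually does with either order.

```python
import unicodedata

# Code points picked purely for illustration (assumption, not from the post).
HIGH_KA = "\u1A20"  # TAI THAM LETTER HIGH KA
SAKOT = "\u1A60"    # TAI THAM SIGN SAKOT
TONE_1 = "\u1A75"   # TAI THAM SIGN TONE-1

# The same marks after the first consonant, typed in two different orders.
tone_first = HIGH_KA + TONE_1 + SAKOT + HIGH_KA
sakot_first = HIGH_KA + SAKOT + TONE_1 + HIGH_KA


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: two strings are canonically equivalent
    when their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Combining classes decide the canonical order of adjacent marks.
print("ccc SAKOT :", unicodedata.combining(SAKOT))
print("ccc TONE-1:", unicodedata.combining(TONE_1))

print("same code points?       ", tone_first == sakot_first)
print("canonically equivalent? ",
      canonically_equivalent(tone_first, sakot_first))
```

A renderer that treats the two inputs differently is therefore distinguishing strings that Unicode defines as equivalent, which is the complaint being made in the thread.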