Philippe Verdy wrote:
> There's a counter example with the position of the circumflex on the
> lowercase t (I can't remember for which language it occurs, sorry),
> which is in some cases not the one that its combining class would
> normally take.

There are also the cases of comma below on a small g (Latvian), which is rendered turned above the g, and of ring below on g (IPA), which should be rendered above the g... Neither of these invalidates, or calls into question, the combining classes of comma below (and cedilla...) or ring below, as far as I can see.

So far, it has been noticed that some Hebrew and Arabic marks, mostly the vowel marks, have inappropriate combining classes. The solution suggested by the UTC is to use CGJ (U+034F COMBINING GRAPHEME JOINER). But the solution also has to be simple and practicable.

Putting a CGJ after each occurrence of the characters with a badly assigned combining class effectively gives them a combining class of 0. Perhaps not ideal, and indeed a kludge. But simple and practical. A keyboard layout, for instance, can just generate a CGJ after each troublesome Arabic and Hebrew mark. With current keyboard layout specification mechanisms, that's about the best that can be done on the keyboard side of it.

Removing superfluous CGJs should be done by a separate utility. Trying to build that into normalisation is probably not such a good idea.

Defining new characters to replace the troublesome ones, a more elegant solution, has been rejected by the UTC. On compatibility grounds, IIRC.

/kent k
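
To make the CGJ effect described above concrete, here is a minimal Python sketch using only the standard unicodedata module. The choice of LAMED with PATAH followed by HIRIQ is only an illustration picked for this sketch (it is not taken from the message), and the helper name protect_marks is hypothetical; it imitates what a keyboard layout emitting a CGJ after each troublesome mark would produce.

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER, canonical combining class 0


def protect_marks(text, troublesome):
    """Append a CGJ after every character listed in `troublesome`.

    Hypothetical helper: mimics a keyboard layout that emits a CGJ
    after each mark whose combining class causes unwanted reordering.
    """
    return "".join(ch + CGJ if ch in troublesome else ch for ch in text)


# Illustrative characters (an assumption of this sketch):
LAMED = "\u05DC"  # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, combining class 14

typed = LAMED + PATAH + HIRIQ

# Without CGJ, canonical reordering sorts the marks by combining class,
# so HIRIQ (14) is moved in front of PATAH (17).
plain = unicodedata.normalize("NFC", typed)

# With a CGJ after each mark, every mark is followed by a class-0
# character, so normalisation cannot reorder across it.
protected = unicodedata.normalize("NFC", protect_marks(typed, {PATAH, HIRIQ}))

print([f"U+{ord(c):04X}" for c in plain])
print([f"U+{ord(c):04X}" for c in protected])
```

Stripping the CGJs from the protected string gives back the order that was typed, which is the property the kludge relies on. Deciding which CGJs are actually superfluous is a separate problem and is not attempted here.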