From: "Peter Constable" <[EMAIL PROTECTED]>
> There is no problem requiring a solution for combining marks used with
> Latin script,* including IPA and Vietnamese, because all of the marks
> that occupy a comparable space relative to the base have the same
> combining class, meaning that normalization does not affect the order.
> For such combinations, there is no "required normalization order".
>
> So, for instance, the sequences < ..., combining macron, combining
> diaeresis, ... > and < ..., combining diaeresis, combining macron, ... >
> are both in canonical order and *not* canonically equivalent.

You're right about the existing diacritics for Latin-script text in existing Latin-based languages. This will remain true as long as there are no diacritics that can occupy several positions in the stack (for example, a diacritic that takes a left or right position depending on the context).

There is a counterexample in the position of the circumflex on the lowercase t (I can't remember which language it occurs in, sorry), which in some cases is not the position its combining class would normally imply. In that case, the diacritic collides with the other diacritic that normally occupies that place with its own distinct combining class, and the relative encoding order of these diacritics becomes important for specifying the layout order in the visual stack. Of course, as long as Latin diacritics are rendered at the position implied by their combining class, this is OK.

In summary: the normative combining class of any combining character impacts the glyph representation, and font renderers are not free to place the diacritic according to linguistic needs, depending on the base character. If a renderer cannot move an above-detached diacritic to the wanted above-right-attached position, there will be an encoding conflict when other diacritics are also present (for example an arrow above).

For this reason, there may be cases, even in the Latin script, where canonical equivalence has a bad effect because diacritics need to be represented differently. Inserting a class-override control character in such cases may also help solve the problem, so that the text can be rendered and interpreted correctly.
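
To make the role of the combining class concrete, here is a minimal sketch in Python using the standard unicodedata module. The base letter, the three marks, and the helper name canonically_equivalent are my own illustrative choices, not anything taken from the messages above. It only shows that marks sharing a combining class keep their encoded order under normalization, while marks with different classes get reordered.

```python
import unicodedata

# Illustrative marks (chosen for this sketch only).
MACRON = "\u0304"     # COMBINING MACRON
DIAERESIS = "\u0308"  # COMBINING DIAERESIS
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW

# Canonical combining classes as reported by the Unicode Character Database.
print(unicodedata.combining(MACRON))     # 230 (above)
print(unicodedata.combining(DIAERESIS))  # 230 (above)
print(unicodedata.combining(DOT_BELOW))  # 220 (below)


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: two strings are canonically equivalent
    when their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Same combining class (230): normalization keeps the encoded order,
# so the two sequences remain distinct.
macron_first = "a" + MACRON + DIAERESIS
diaeresis_first = "a" + DIAERESIS + MACRON
same_class_equivalent = canonically_equivalent(macron_first, diaeresis_first)

# Different combining classes (220 vs 230): normalization sorts the marks
# by class, so the encoded order is lost.
below_first = "a" + DOT_BELOW + MACRON
above_first = "a" + MACRON + DOT_BELOW
diff_class_equivalent = canonically_equivalent(below_first, above_first)

print(same_class_equivalent)  # False
print(diff_class_equivalent)  # True
```

The second case is the one that matters for the concern above: whenever two marks have distinct combining classes, their relative encoding order does not survive normalization and so cannot carry layout information.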