...
From _The Unicode Standard 4.0_, 3.11 at http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf:
<< If combining characters have different combining classes--for example, when one nonspacing mark is above a base character form and another is below it--then no distinction of graphic form or semantic will result. >>
Later:
<< _D46 Combining class:_ A numeric value given to each combining Unicode character that determines with which other combining characters it typographically interacts. >>
From _The Unicode Standard 4.0_, 4.3 at http://www.unicode.org/versions/Unicode4.0.0/ch04.pdf:
<< Each combining character has a normative canonical _combining class._ This class is used with the canonical ordering algorithm to determine which combining characters interact typographically and to determine how the canonical ordering of sequences of combining characters takes place. >>
This indicates that characters in different classes should not interact typographically.
Rather, it defines that they do not. But since this is not true under any reasonable intuitive definition of "interact typographically" (as we have seen with Hebrew vowel points), the statement makes sense only as a counterintuitive definition of "interact typographically".
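To make the canonical ordering side of this concrete, here is a minimal sketch using Python's standard unicodedata module. It assumes the combining classes shipped with the Python runtime (a later UCD version than 4.0, though I believe these particular values are unchanged); the helper name classes() and the sample strings are mine, chosen only for illustration. Marks of different classes are reordered by class on normalization, whatever they look like on the page, while marks of the same class keep their typed order.

```python
import unicodedata


def classes(text):
    """List (code point, canonical combining class) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


# Different classes: acute (230) typed before cedilla (202).
mixed = "g\u0301\u0327"
nfd_mixed = unicodedata.normalize("NFD", mixed)

# Same class: acute and grave are both 230, so the typed order is significant.
same = "g\u0301\u0300"
nfd_same = unicodedata.normalize("NFD", same)

# Hebrew: patah (17) typed before hiriq (14) on lamed.
hebrew = "\u05DC\u05B7\u05B4"
nfd_hebrew = unicodedata.normalize("NFD", hebrew)

for label, s in [("mixed", nfd_mixed), ("same", nfd_same), ("hebrew", nfd_hebrew)]:
    print(label, classes(s))
```

The Hebrew case is the awkward one: the two vowel points plainly affect each other's placement, yet they sit in different classes and so are silently reordered.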
Cedilla belongs to class 202 meaning "Below attached" according to http://www.unicode.org/Public/UNIDATA/UCD.html#Canonical_Combining_Class_Values.
However, from _The Unicode Standard 4.0_, 7.1:
<< A similar situation can be seen in the Latvian letter U+0123 LATIN SMALL LETTER G WITH CEDILLA. In good Latvian typography, this character is always shown with a rotated comma over the g, rather than a cedilla below the g, because of the typographical design and layout issues resulting from trying to place a cedilla below the descender loop of the g. Poor Latvian fonts may substitute an acute accent for the rotated comma, and handwritten or other printed forms may actually show the cedilla below the g. >>
Later at 7.7:
<< U+0326 COMBINING COMMA BELOW is sometimes rendered as U+0312 COMBINING TURNED COMMA ABOVE on a lowercase "g" to avoid conflict with the descender. >>
So we have two cases noted where characters with combining class 202 (Below attached) can by Unicode specifications be rendered as if they belonged to combining class 214 (Above attached).
In such cases they obviously do not interact with other combining class 202 characters but rather would interact with combining class 214 characters. Currently there are no class 214 characters--which is a blessing. :-)
But this still breaks the model.
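The mismatch between the recorded class and the rendered position can be checked directly for the Latvian g. This is only a sketch, again assuming the character data bundled with Python's unicodedata module rather than the 4.0 files; the variable names are illustrative. U+0123 decomposes to g plus U+0327 COMBINING CEDILLA, which is recorded as class 202 however a font chooses to draw it, while U+0312, the shape it is commonly drawn as, is recorded as class 230.

```python
import unicodedata

g_cedilla = "\u0123"  # LATIN SMALL LETTER G WITH CEDILLA
decomposed = unicodedata.normalize("NFD", g_cedilla)

# The class recorded for each character of the canonical decomposition.
for ch in decomposed:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))

# The class of the mark whose shape good Latvian typography actually uses.
turned_comma_above = "\u0312"
print(
    f"U+{ord(turned_comma_above):04X}",
    unicodedata.name(turned_comma_above),
    unicodedata.combining(turned_comma_above),
)
```

Nothing in the character data records that the rendered position may differ from the position the class implies; that knowledge lives only in the font.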
Also, on an intuitive definition of "interact typographically", this shifted comma below would interact with any centred above accent, class 230. For that matter, even in its normal position a comma below, class 202, would interact e.g. with a macron below, class 220. True, only one ordering is possible as the macron cannot come closer to the character than the attached comma below, so there is no problem for the canonical ordering model. But the position of the macron still has to be shifted to avoid cutting across the comma below, so the "interact typographically" rule is broken. The implication is that a font designer cannot assume that glyph positioning adjustments are required only between adjacent characters in canonical order, despite what the "interact typographically" rule might suggest.
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/