On Wednesday, November 24th, 2004 16:26Z Tim Greenwood wrote:

> All of the spacing combining marks (general category Mc) except
> musical symbols have a canonical combining class of 0.
> Why is this?

About the Indic vowel signs, I assume it is this way to keep them from being reordered (in weird ways), particularly when multi-piece vowels are involved.

> The Canonical
> Combining Class Values in UCD.html has entries and values for left
> attached and right attached - but no characters have these values.

They (the Indic vowel signs) had classes greater than 0 before v.2.1.8 (1998). I believe UCD.html still reflects this past state. For example, the accompanying README tells us:

    Note that as of the 2.1.8 update of the Unicode Character Database,
    the decompositions in the UnicodeData.txt file can be used to
    recursively derive the full decomposition in canonical order, without
    the need to separately apply canonical reordering. However, canonical
    reordering of combining character sequences must still be applied in
    decomposition when normalizing source text which contains any
    combining marks.

I assume it has to do with the work on TR15, which you might consult (http://www.unicode.org/reports/tr15/tr15-10.html) for enlightenment.

Antoine
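
For anyone who wants to check the points above, here is a minimal sketch using Python's standard unicodedata module. It is only an illustration: the sample characters are my own choice, the helper name describe is hypothetical, and the values come from whichever Unicode version ships with the interpreter, not from the 2.1.8 data discussed in this thread.

```python
import unicodedata


def describe(ch):
    """Return (general category, canonical combining class) for one character."""
    return unicodedata.category(ch), unicodedata.combining(ch)


# Spacing vowel signs chosen for illustration: category Mc, class 0.
devanagari_aa = "\u093E"      # DEVANAGARI VOWEL SIGN AA
bengali_o = "\u09CB"          # BENGALI VOWEL SIGN O, a two-part vowel

# A musical symbol: also category Mc, but with a non-zero class.
musical_stem = "\U0001D165"   # MUSICAL SYMBOL COMBINING STEM

for ch in (devanagari_aa, bengali_o, musical_stem):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), describe(ch))

# Decomposing the two-part vowel yields its pieces in a fixed order.
# Both pieces have class 0, so canonical reordering leaves them alone.
nfd_o = unicodedata.normalize("NFD", bengali_o)
print([f"U+{ord(c):04X}" for c in nfd_o])

# By contrast, marks with different non-zero classes are sorted:
# U+0301 (class 230) typed before U+0323 (class 220) gets swapped.
reordered = unicodedata.normalize("NFD", "a\u0301\u0323")
print([f"U+{ord(c):04X}" for c in reordered])
```

The last two steps show the contrast: class-0 pieces keep the order given by the decomposition, while marks with differing non-zero classes are put into canonical order during normalization.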

