On 24/09/2003 14:58, Jon Hanna wrote:

... For example since following the decomposition <U+0104> -> <U+0041, U+0328> there can be no character that is unblocked from the U+0041 that will combine with it, hence there is no circumstance in which they will not be recombined to U+0104 and hence dropping that decomposition from the data will not affect NFC (the relevant data would still have to be in the composition table, as the sequence <U+0041, U+0328> might occur in the source code).



Is this actually correct? For example, if I have in my data the string <U+0104, U+05B0> (which I know is garbage, but that is irrelevant), that will decompose and reorder to <U+0041, U+05B0, U+0328>, as U+0328 has a higher combining class (202) than U+05B0 (10). What does this become in NFC? Is the reordering reversed and the combination reapplied?
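One way to check this empirically is with a short Python sketch that uses the standard unicodedata module. The helper name codepoints is only for illustration, and the result depends on the Unicode data shipped with the interpreter, though combining classes and canonical decompositions are meant to be stable. As I read the composition rules, U+05B0 does not block U+0328 because its combining class is lower, so I would expect NFC to give back <U+0104, U+05B0>.

```python
import unicodedata

def codepoints(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# The (garbage) test string from the message: A WITH OGONEK followed by SHEVA.
source = "\u0104\u05B0"

nfd = unicodedata.normalize("NFD", source)
nfc = unicodedata.normalize("NFC", source)

# Show the canonical combining class of each character after decomposition.
for ch in nfd:
    print(f"U+{ord(ch):04X} ccc={unicodedata.combining(ch)}")

print("NFD:", codepoints(nfd))
print("NFC:", codepoints(nfc))
```

If the NFC line matches the original string, the intervening mark did not prevent recomposition for this particular input.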

This is not only a theoretical issue as the same applies to some real combinations. There was discussion only last week on the bidi list of a form which might be encoded <U+064A, U+0652, U+0654> but which would be messed up if composed into <U+0626, U+0652>.
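The Arabic case can be checked the same way. This is only a sketch using Python's standard unicodedata module (the helper name codepoints is mine), and it shows what a conformant normalizer does with the sequence, not whether that result is desirable for the text.

```python
import unicodedata

def codepoints(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# YEH, SUKUN, HAMZA ABOVE: the sequence mentioned on the bidi list.
source = "\u064A\u0652\u0654"

# Canonical combining classes of the two marks.
for ch in source[1:]:
    print(f"U+{ord(ch):04X} ccc={unicodedata.combining(ch)}")

nfc = unicodedata.normalize("NFC", source)
print("NFC:", codepoints(nfc))
```

Because the sukun has a lower combining class than the hamza above, it does not block the hamza from combining with the yeh, which is why the composed form ends up as <U+0626, U+0652>.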

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




