-----BEGIN PGP SIGNED MESSAGE-----

Clauses D20 and D21 of the Unicode Standard (3.0 or 3.1) read:

# D20 Compatibility decomposition: the decomposition of a character that
# results from recursively applying /both/ the compatibility /and/ the
# canonical mappings found in the names list of /Section 14.1,
# Character Names List/, and those described in /Section 3.11,
# Conjoining Jamo Behavior/, until no characters can be further
# decomposed, and then reordering nonspacing marks according to
# /Section 3.10, Canonical Ordering Behavior/.
#
# - A compatibility decomposition may remove formatting information.
#
# D21 Compatibility character: a character that has a compatibility
# decomposition.
#
# - Compatibility characters are included in the Unicode Standard to
#   represent distinctions in other base standards. They support
#   transmission and processing of legacy data. Their use is discouraged
#   other than for legacy data.
# - Replacing a compatibility character by its decomposition may lose
#   round-trip convertibility with a base standard.

By definition D20, if a character has a canonical decomposition, then it
also has a compatibility decomposition. This is correct, because NFKD
includes all the decompositions that NFD does.

The problem is with D21: if all characters that have a canonical
decomposition also have a compatibility decomposition, then all of these
are compatibility characters. Clearly that wasn't what was intended, and
it is inconsistent with the two bullet points that follow D21.

I think the correct definition of a compatibility character is a character
with a compatibility decomposition that differs from its canonical
decomposition (i.e. NFKC(c) != NFC(c)). Am I right?

(Note that it wouldn't be correct to define a compatibility character
simply as a character that has a "<...> ..." entry in the decomposition
field of the UCD; a counterexample is U+03D3.)
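To make the proposed test concrete, here is a rough sketch in Python using the standard unicodedata module. The helper names has_tagged_decomposition and is_compatibility_character are mine, purely for illustration, and the results depend on the UCD version bundled with the interpreter rather than on Unicode 3.0 or 3.1 specifically. It contrasts the "<...>" tag check with the NFKC(c) != NFC(c) check for three sample characters.

```python
import unicodedata

def has_tagged_decomposition(ch):
    # Hypothetical helper: True when the UCD decomposition field of ch
    # starts with a "<tag>", e.g. "<compat> 0066 0069".
    return unicodedata.decomposition(ch).startswith("<")

def is_compatibility_character(ch):
    # Hypothetical helper implementing the definition proposed above:
    # the compatibility and canonical normalizations of ch differ.
    nfkc = unicodedata.normalize("NFKC", ch)
    nfc = unicodedata.normalize("NFC", ch)
    return nfkc != nfc

samples = [
    "\u00e9",  # LATIN SMALL LETTER E WITH ACUTE (canonical decomposition only)
    "\ufb01",  # LATIN SMALL LIGATURE FI (tagged <compat> decomposition)
    "\u03d3",  # GREEK UPSILON WITH ACUTE AND HOOK SYMBOL (the counterexample)
]

for ch in samples:
    print(
        "U+%04X" % ord(ch),
        "field=%r" % unicodedata.decomposition(ch),
        "tagged=%s" % has_tagged_decomposition(ch),
        "NFKC!=NFC=%s" % is_compatibility_character(ch),
    )
```

For U+03D3 the decomposition field is the untagged "03D2 0301", so the tag check says no, yet the NFKC != NFC check says yes, because U+03D2 in turn has a "<compat>" mapping to U+03A5.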

- -- 
David Hopwood <[EMAIL PROTECTED]>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBO9kOszkCAxeYt5gVAQEvfAgAhPW+uauuxRArxCWPJgYBW54AvAdg3yxB
iATHjKED/4s+KkfMGP6kq3RzZpgD21MpeOacIG4+NWkgd8wHMRAvNWc2n+PEU+KJ
A3Ngf/vDV+JZxhDX09s6lSxagfkQDhxB/bzgGMzpyCUdJshgiBsnTd4C8/IXbzgR
KNi9XeZ+jEGYV+24S9stnMClmV/xMI9FR2QV2mA72Li5AgFR/DoRxSaeV4XiMw+3
RTJP5gVSQeUv1TsXD4X8J3z0YzxiFFzwPlIbG3o1BOcwjPrROmV0ULJQM1ufemGi
Q/VJrkvPPyxibcOAk8Vb6LtA+jyyoi9TAod3JcLWDsEiIq1bfbcBKw==
=tQg1
-----END PGP SIGNATURE-----