Pim Blokland scripsit:

> Besides, your example is proof that the implementation can change;
> has to change. Where applications could use 8-bit characters to
> store hex digits in the old days, they now have to use 16-bit
> characters to keep up with Unicode...
You are confusing the *representation* of characters with the *choice*
of characters. The representation of the characters used for hex digits
can and does change: it can be ASCII, EBCDIC, or Unicode. The choice of
characters is fixed: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A/a, B/b, C/c, D/d,
E/e, F/f.

> > There is also a HUGE semantic difference between D meaning the
> > letter D and Roman numeral D meaning 500.
>
> and those have different code points! So you're saying Jill is
> right, right?

No. The Roman numeral characters are encoded solely for compatibility
with East Asian character sets. (The same is true of the KELVIN SIGN.)

> What we're talking about is different general categories, different
> numeric values and even, oddly enough, different BiDi categories.
> Doesn't that qualify for creating new characters?

As a practical matter, trying to go through all legacy texts (now
including legacy Unicode texts!) and disambiguate every instance of
A-F/a-f between alphabetic and hexadecimal uses would be inconceivable.
The justification for not splitting off Turkish i and I from general
Latin, despite their unusual case mappings, is exactly the same.

--
If you have ever wondered if you are in hell,       John Cowan
it has been said, then you are on a well-traveled   http://www.ccil.org/~cowan
road of spiritual inquiry. If you are absolutely    http://www.reutershealth.com
sure you are in hell, however, then you must be     [EMAIL PROTECTED]
on the Cross Bronx Expressway.
        --Alan Feuer, NYTimes, 2002-09-20
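(A minimal sketch, not part of the original post: Python 3 with its
bundled unicodedata module is assumed. It checks the points made above,
encoding-dependent representation, the compatibility status of the Roman
numerals and KELVIN SIGN, and the category/numeric/BiDi property
differences, directly against the Unicode Character Database.)

    import unicodedata

    # Representation changes with the encoding: the same abstract 'A'
    # is byte 0x41 in ASCII, byte 0xC1 in EBCDIC (code page 037), and
    # code point U+0041 in Unicode. The character chosen is the same.
    print('A'.encode('ascii'))   # b'A'
    print('A'.encode('cp037'))   # b'\xc1'
    print(hex(ord('A')))         # 0x41

    # ROMAN NUMERAL FIVE HUNDRED (U+216E) and KELVIN SIGN (U+212A) exist
    # only for round-trip compatibility; normalization folds them back
    # to plain D and K.
    print(unicodedata.normalize('NFKC', '\u216E'))  # 'D' (compatibility mapping)
    print(unicodedata.normalize('NFC', '\u212A'))   # 'K' (canonical mapping)

    # The property differences Pim points to: general category, numeric
    # value, and BiDi class all differ between letter D, numeral D, and
    # a decimal digit.
    for ch in ('D', '\u216E', '5'):
        print('U+%04X' % ord(ch),
              unicodedata.category(ch),        # Lu / Nl / Nd
              unicodedata.numeric(ch, None),   # None / 500.0 / 5.0
              unicodedata.bidirectional(ch))   # L / L / EN

The Turkish mappings, for comparison, are handled as locale-conditional
rules in the UCD's SpecialCasing.txt rather than by encoding separate
Turkish letters.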