On Thu, 21 Mar 2002, Masataka Ohta wrote:
> Unicode is not usable in international context. [...] Unicode is
> usable in some local context. [...] However, the context information
> must be supplied out of band.

Let me see if I can understand this argument about Unicode and local
context.

I am an English speaker who can't tell the difference between the
Chinese character that appears as the second character of the Chinese
word for the city that I call "Beijing", and the Japanese character
that appears as the second character of the Japanese word for the city
that I call "Tokyo". I believe that (as used in the city names) both
characters mean something like the English word "capital".

Say there's a Chinese character that looks (to uneducated western eyes)
like a box with three legs and a hat, and a Japanese character that
looks (to uneducated western eyes) like a box with three legs and a
hat. Say the Chinese character looks slightly different from the
Japanese character, but a Chinese person can easily recognise the
Japanese character and understand its meaning in context, and a
Japanese person can easily recognise the Chinese character and
understand its meaning in context.

As far as I understand, Unicode would say that these are not two
different characters, but just different display forms of the same
unified character (or whatever the correct technical terms are).
Display software would have to have out of band knowledge to help it
choose between the Chinese and Japanese display forms.

As far as I understand, absence of out of band knowledge could lead to
the hypothetical Unicode character <CJK character that looks a bit like
a box with three legs and a hat> being displayed as if it were <Chinese
character that looks a bit like a box with three legs and a hat>, even
if the author's intent was to display <Japanese character that looks a
bit like a box with three legs and a hat>.

As far as I understand, Masataka Ohta considers this to be a fatal flaw
in Unicode. I hope he will correct me if I have misunderstood his
objection.

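To make the unification point concrete, here is a minimal Python sketch.
It assumes the character under discussion is U+4EAC, the second
character of both city names as they are usually written; the language
tags at the end are only a hypothetical illustration of "out of band"
context, not part of any IDN proposal.

```python
import unicodedata

# City names written with explicit code points (assumed spellings).
beijing = "\u5317\u4eac"  # Chinese name for Beijing
tokyo = "\u6771\u4eac"    # Japanese name for Tokyo


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


second_zh = beijing[1]
second_ja = tokyo[1]

# Both strings carry the same unified code point; nothing in the text
# itself says whether a Chinese or a Japanese glyph is wanted.
same_code_point = second_zh == second_ja

print(describe(second_zh))
print(describe(second_ja))
print("same code point:", same_code_point)

# Hypothetical out-of-band context: a language tag kept next to the
# text, which display software could use to choose a glyph variant.
tagged = [("zh", beijing), ("ja", tokyo)]
for lang, text in tagged:
    print(lang, [f"U+{ord(c):04X}" for c in text])
```

The comparison only shows that the encoded text is identical at that
position; whether the two glyph forms differ visibly depends on the
fonts in use, which this sketch does not examine.
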
I don't know enough to tell whether the difference between
corresponding Chinese and Japanese characters is analogous to a font
difference or a spelling difference, but the "ignorant westerner can't
tell the difference" test biases me towards the "font difference" side.
If they are analogous to spelling differences, then I would say that
unifying the different characters was probably an error in Unicode, but
that IDN should not try to undo that unification. Either way, I think
that IDN should document the potential problem but not try to fix it.

(In contrast, I think I have learned enough by following the past <n>
months of discussion to tell that the differences between Traditional
and Simplified Chinese are analogous to spelling differences, and so
IDN should not try to unify them.)

--apb (Alan Barrett)
