"Eli Zaretskii" <[EMAIL PROTECTED]> writes: >> The GNU Emacs/Unicode proposal I've seen seems to have this property, >> too. (At least the proposal is ambiguous, and one interpretation is >> that you can encode a single character in multiple ways.) > > Unless you refer to the CNS plane and Japanese Han characters, which > were deliberately left ununified (in addition to the Unicode > codepoints for those characters), I think you are mistaken.

I hope so. ;-)

> Could you please point out where in the proposal do you see that a
> character can be encoded in multiple ways?

Now that the surrogate stuff has been explained, I think the encoding
to UCS-E (Unicode-compatible Character Set for Emacs) is indeed
unambiguous. However, UTF-E (the buffer encoding) still allows
different encodings of the same UCS-E code point. I think this can be
resolved, though.

-- 
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
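
To illustrate how a buffer encoding can admit more than one byte sequence for the same code point, here is a minimal sketch that uses UTF-8 as a stand-in. The byte layout of UTF-E is not described in this message, so this is only an analogy, and the helper names (`encode_2byte`, `decode_2byte_lenient`, `is_strict_utf8`) are made up for the example.

```python
def encode_2byte(cp: int) -> bytes:
    """Pack cp into the UTF-8 two-byte pattern 110xxxxx 10xxxxxx.

    Hypothetical helper: it deliberately does NOT check that two bytes
    is the shortest form, so small code points get an overlong encoding.
    """
    if not 0 <= cp <= 0x7FF:
        raise ValueError("code point does not fit in the two-byte pattern")
    return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])


def decode_2byte_lenient(data: bytes) -> int:
    """Decode a two-byte sequence without rejecting overlong forms."""
    return ((data[0] & 0x1F) << 6) | (data[1] & 0x3F)


def is_strict_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


shortest = "A".encode("utf-8")      # one byte, the canonical form
overlong = encode_2byte(ord("A"))   # two bytes for the same code point

# A lenient decoder maps both sequences to the same code point ...
same_code_point = decode_2byte_lenient(overlong) == shortest[0]

# ... while a strict decoder accepts only the shortest form.
strict_accepts_shortest = is_strict_utf8(shortest)
strict_accepts_overlong = is_strict_utf8(overlong)
```

UTF-8 removes the ambiguity by declaring every form other than the shortest one invalid. Whether UTF-E would be resolved by the same rule is an assumption here, not something stated in the proposal.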