Ronan Waide <[EMAIL PROTECTED]> writes:

> On December 23, [EMAIL PROTECTED] said:
>> Note that Unicode cannot (yet?) represent all the characters that
>> Emacs can represent.
>
> Really? How is this the case? Or more to the point, what's the set of
> characters in Emacs that can't be represented in Unicode?

In Emacs, Latin-1 ä and Latin-2 ä are two distinct characters. I think in Unicode there is only one ä.

There was much talk about `Han unification'. I have no idea what that is, but I /think/ it means that some Chinese characters are identified with some Japanese characters. So from the Unicode code point alone you don't know which character it is. (But applications might wish to use different glyphs depending on whether they're showing Chinese or Japanese text.) Note that this is just hearsay -- it might be completely wrong.

In Emacs, the Chinese and the Japanese character are considered to be distinct.

I think the new Unicode-based internal encoding in Emacs will offer some way around `Han unification', perhaps by using private extension areas in Unicode.

In my previous message I should probably have said BMP instead of Unicode, as Unicode has extensibility built-in...

-- 
~/.signature is: umop ap!sdn (Frank Nobis)
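
To check the point about ä, here is a minimal sketch in Python (not something from the thread, used here only for illustration). It decodes the single byte 0xE4 under ISO 8859-1 (Latin-1) and ISO 8859-2 (Latin-2) and compares the results. Both decodings give the same Unicode code point, which fits the claim that Unicode has only one ä. It says nothing about how Emacs represents the two characters internally.

```python
import unicodedata

# Byte 0xE4 encodes "ä" in both ISO 8859-1 (Latin-1) and ISO 8859-2 (Latin-2).
raw = b"\xe4"

latin1_a = raw.decode("iso8859_1")  # decode as Latin-1
latin2_a = raw.decode("iso8859_2")  # decode as Latin-2

# Both decodings map to the same Unicode code point, so the information
# about which charset the character came from is lost after decoding.
same_code_point = latin1_a == latin2_a
code_point = ord(latin1_a)
name = unicodedata.name(latin1_a)

print(same_code_point, hex(code_point), name)
```

A round trip through Unicode therefore cannot tell a "Latin-1 ä" from a "Latin-2 ä", which is exactly the distinction Emacs keeps.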