On Thu, 25 Oct 2001, Eli Zaretskii wrote:

> > Is the internal representation still the special MULE format?
>
> Yes. But the internal representation is not the problem here; ideally,
> users and Lisp programs shouldn't be worrying about how characters are
> represented internally. The problem is that characters are still not
> unified in Emacs 21.

Not entirely. Internal representation does matter somewhat when it comes to the handling of malformed UTF-8 sequences.

I think it is highly desirable that the UTF-8 -> Emacs internal -> UTF-8 conversion round trip be made 100% binary transparent. Loading and saving a file that contains malformed UTF-8 sequences should not change them, but character encoding conversions are prone to throwing away information when the source byte stream is invalid.

Using UTF-8 as the internal Emacs encoding is one way of achieving continued guaranteed binary transparency; coming up with a tricky encoding for malformed UTF-8 sequences is another. I favour the former approach, which is also what other modern UTF-8-capable editors do today.

Markus

--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
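
P.S. The round-trip property argued for above can be illustrated with a small sketch. It is not Emacs code and says nothing about how Emacs handles this internally; it assumes Python 3 and uses its standard "replace" and "surrogateescape" error handlers purely to contrast a lossy conversion with a binary-transparent one. The surrogateescape handler is an instance of the second approach (a special encoding for malformed bytes), not of using UTF-8 as the internal representation. The helper name `roundtrip` and the sample bytes are made up for the example.

```python
def roundtrip(raw: bytes, errors: str) -> bytes:
    """Decode raw as UTF-8 with the given error handler, then encode it back."""
    text = raw.decode("utf-8", errors=errors)
    return text.encode("utf-8", errors=errors)


# Hypothetical sample: valid UTF-8 text followed by two bytes (0xFF, 0xFE)
# that can never appear in well-formed UTF-8.
raw = b"caf\xc3\xa9 \xff\xfe end"

# Lossy: each undecodable byte becomes U+FFFD, so the original bytes are gone.
lossy = roundtrip(raw, "replace")

# Transparent: each undecodable byte is mapped to a lone surrogate
# (U+DC80..U+DCFF) on decoding and mapped back to the same byte on encoding.
transparent = roundtrip(raw, "surrogateescape")

print(lossy == raw)        # False
print(transparent == raw)  # True
```

With the lossy handler, merely loading and saving the file changes it; with the escaping handler, the malformed bytes survive untouched, which is the behaviour asked for above.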