> From: Markus Kuhn <[EMAIL PROTECTED]>
> Date: Sat, 27 Oct 2001 19:27:51 +0100 (BST)
>
> On Thu, 25 Oct 2001, Eli Zaretskii wrote:
>
> > > Is the internal representation still the special MULE format?
> >
> > Yes. But the internal representation is not the problem here; ideally,
> > users and Lisp programs shouldn't be worrying about how characters are
> > represented internally. The problem is that characters are still not
> > unified in Emacs 21.
>
> Not entirely.
>
> Internal representation does matter somewhat when it comes to the handling
> of malformed UTF-8 sequences. I think it is highly desirable that the
> UTF-8 -> emacs internal -> UTF-8 conversion roundtrip is made 100% binary
> transparent.
I think this already works in Emacs 21.1, even though the internal
representation is nowhere near UTF-8. If you see something else, please
report that as a bug.

> Using UTF-8 as the internal Emacs encoding is one way of achieving
> continued guaranteed binary transparency, coming up with a tricky encoding
> for malformed UTF-8 sequences is another one. I favour the former
> approach, which is also what other UTF-8 capable modern editors do today.

Emacs cannot use a pure UTF-8 encoding, since some cultures don't want
unification, and it was decided that Emacs should not force unification on
those cultures. So the planned Unicode-based internal representation
resembles UTF-8 very closely, but is not identical to it.
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
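
For readers who want to see what a binary-transparent round trip means in
practice, here is a minimal sketch in Python. It is not how Emacs 21 does
it; it only illustrates the second approach mentioned in the thread, a
special encoding for malformed UTF-8 bytes, using Python's standard
"surrogateescape" error handler. The function name roundtrip and the sample
byte strings are made up for this illustration.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode bytes to an internal string and encode them back.

    Hypothetical helper for illustration only. Bytes that are not part of
    well-formed UTF-8 are mapped to lone surrogates U+DC80..U+DCFF on
    decoding and mapped back to the same bytes on encoding.
    """
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


samples = [
    "na\u00efve".encode("utf-8"),  # well-formed UTF-8
    b"abc\xff\xfedef",             # bytes that never occur in UTF-8
    b"\xe2\x82",                   # truncated three-byte sequence
    b"\xc0\xaf",                   # overlong encoding of '/'
]

# True for every sample if the round trip preserved the input byte for byte.
results = [roundtrip(s) == s for s in samples]
print(results)
```

The point of the sketch is only that the internal form does not have to be
UTF-8 itself for the round trip to be lossless; it is enough that every
malformed byte has its own distinct internal code.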