Re: unicode in emacs 21

Markus Kuhn Sat, 27 Oct 2001 10:57:43 -0700

On Thu, 25 Oct 2001, Eli Zaretskii wrote:
> > Is the internal representation still the special MULE format?
>
> Yes.  But the internal representation is not the problem here; ideally,
> users and Lisp programs shouldn't be worrying about how characters are
> represented internally.  The problem is that characters are still not
> unified in Emacs 21.


Not entirely.

Internal representation does matter somewhat when it comes to handling
malformed UTF-8 sequences. I think it is highly desirable that the
UTF-8 -> Emacs internal -> UTF-8 conversion roundtrip be 100% binary
transparent. Loading and saving a file that contains malformed UTF-8
sequences should not change them, but character encoding conversions are
prone to throwing away information when the source byte stream is
invalid.
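
To illustrate the kind of information loss meant here, a minimal sketch
in Python (chosen purely for illustration; it says nothing about how
Emacs converts text, and the byte string is a made-up example). A
converter that substitutes a replacement character for undecodable
bytes cannot give the original bytes back on save:

```python
# A made-up byte stream: plain ASCII, a stray 0xFF byte (never valid in
# UTF-8), one more ASCII letter, and a valid two-byte sequence for U+00E9.
original = b"abc\xffd\xc3\xa9"

# A lossy converter: every undecodable byte becomes U+FFFD.
text = original.decode("utf-8", errors="replace")

# Saving encodes U+FFFD as the three bytes EF BF BD, not as 0xFF,
# so the file on disk has silently changed.
saved = text.encode("utf-8")

print(original == saved)  # False: the roundtrip is not binary transparent
```

After this roundtrip there is no way to tell whether the file originally
contained 0xFF or a genuine U+FFFD.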

Using UTF-8 as the internal Emacs encoding is one way of guaranteeing
continued binary transparency; coming up with a tricky encoding
for malformed UTF-8 sequences is another. I favour the former
approach, which is also what other modern UTF-8-capable editors do today.
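
For comparison, the second approach (a special encoding for malformed
bytes) can be sketched as follows. This uses Python's "surrogateescape"
error handler, which is one existing scheme of that kind and is only an
assumption about what such an encoding might look like; it is not what
Emacs does, and the input bytes are invented for the example:

```python
# Made-up input with two malformed spots: a lone 0xFF byte and a
# truncated three-byte sequence (E2 82 without its final byte).
original = b"ok \xff and \xe2\x82 end"

# "surrogateescape" maps each undecodable byte 0xNN to the lone
# surrogate code point U+DCNN, so nothing is discarded on load.
text = original.decode("utf-8", errors="surrogateescape")

# On save, the same handler turns each U+DCNN back into byte 0xNN.
saved = text.encode("utf-8", errors="surrogateescape")

print(original == saved)  # True: the malformed bytes survive unchanged
```

The price is that every consumer of the internal text has to know about
the escape code points, which is the complexity that keeping UTF-8 as
the internal encoding avoids.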

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
