Re: Unicode String Models

Eli Zaretskii via Unicode Tue, 11 Sep 2018 19:36:58 -0700

> Date: Wed, 12 Sep 2018 00:13:52 +0200
> Cc: [email protected]
> From: Hans Åberg via Unicode <[email protected]>
> 
> It might be useful to represent non-UTF-8 bytes as Unicode code points. One 
> way might be to use a codepoint to indicate high bit set followed by the byte 
> value with its high bit set to 0, that is, truncated into the ASCII range. 
> For example, U+0080 looks like it is not in use, though I could not verify 
> this.


You must use a codepoint that is not defined by Unicode and never
will be.  That is what Emacs does: it extends the Unicode codepoint
space beyond 0x10FFFF.
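
To make the idea concrete, here is a minimal Python sketch of mapping
undecodable bytes to values above 0x10FFFF and back.  It is an
illustration, not Emacs's actual code: the function names are
hypothetical, and the offset 0x3FFF00 (raw byte 0xNN becomes
0x3FFF00 + 0xNN) is an assumption modelled on the range the Emacs Lisp
manual describes for raw 8-bit bytes.  Since Python's str cannot hold
values above 0x10FFFF, the decoded text is kept as a list of integers.

```python
# Assumed offset: raw byte 0xNN (0x80..0xFF) is stored as 0x3FFF00 + 0xNN,
# which lies outside the Unicode range 0..0x10FFFF.
RAW_BYTE_BASE = 0x3FFF00


def decode_with_raw_bytes(data: bytes) -> list:
    """Decode UTF-8; return code points, with invalid bytes mapped above 0x10FFFF."""
    # "surrogateescape" turns each undecodable byte 0xNN into U+DC00 + 0xNN.
    # The strict UTF-8 decoder never yields surrogates from valid input,
    # so any value in U+DC80..U+DCFF here marks an escaped raw byte.
    text = data.decode("utf-8", errors="surrogateescape")
    result = []
    for ch in text:
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            result.append(RAW_BYTE_BASE + (cp - 0xDC00))
        else:
            result.append(cp)
    return result


def encode_with_raw_bytes(code_points) -> bytes:
    """Inverse of decode_with_raw_bytes: restore the original byte sequence."""
    out = bytearray()
    for cp in code_points:
        if RAW_BYTE_BASE + 0x80 <= cp <= RAW_BYTE_BASE + 0xFF:
            out.append(cp - RAW_BYTE_BASE)    # emit the raw byte unchanged
        else:
            out += chr(cp).encode("utf-8")    # ordinary Unicode scalar value
    return bytes(out)
```

Because the extended values can never collide with a character that
Unicode assigns now or later, the mapping round-trips arbitrary byte
strings: encode_with_raw_bytes(decode_with_raw_bytes(data)) == data.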
