On Mon, 10 Sep 2001, Oyvind Holm wrote:

> > UTC is also working on restricting UTF-8 to something equivalent to
> > RFC 2279's definition (well, for the range U+0000 to U+10FFFF) in
> > Unicode 3.2. That's very good news I think.
>
> What will these restrictions be? Big changes?

Well, UTF-8 will be made simpler. Currently, Unicode-conformant UTF-8
decoders should accept 'irregular' UTF-8 (a code point above U+FFFF
encoded as a UTF-16 surrogate pair, with each surrogate then re-encoded
as UTF-8). With the change, there will be no need for that anymore, and
decoders will be allowed to reject irregular sequences, or even forget
about their existence.
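
To make the difference concrete, here is a rough Python sketch (mine, not
anything taken from the UTC text). The helper names encode_irregular and
strict_decode are made up for illustration, and I am assuming Python 3's
built-in strict "utf-8" codec as a stand-in for a decoder that rejects
irregular sequences:

```python
def encode_irregular(cp: int) -> bytes:
    """Hypothetical helper: build the 'irregular' six-byte form of a
    supplementary code point by splitting it into a UTF-16 surrogate
    pair and encoding each surrogate as a three-byte UTF-8 sequence."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF have an irregular form")
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)    # high (leading) surrogate
    low = 0xDC00 + (v & 0x3FF)   # low (trailing) surrogate
    out = bytearray()
    for s in (high, low):
        # Standard three-byte UTF-8 bit layout, applied to a surrogate value.
        out += bytes([0xE0 | (s >> 12),
                      0x80 | ((s >> 6) & 0x3F),
                      0x80 | (s & 0x3F)])
    return bytes(out)


def strict_decode(data: bytes) -> str:
    """Hypothetical helper: decode with Python's strict UTF-8 codec,
    which raises UnicodeDecodeError on encoded surrogates."""
    return data.decode("utf-8")


regular = chr(0x10000).encode("utf-8")   # four-byte form: F0 90 80 80
irregular = encode_irregular(0x10000)    # six-byte form: ED A0 80 ED B0 80

print(regular.hex(), irregular.hex())
print(repr(strict_decode(regular)))
try:
    strict_decode(irregular)
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

A decoder written this way only ever has to deal with the four-byte form
for U+10000 and above, which is the simplification I mean.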

roozbeh

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
