Re: Unicode 3.1: UTF-8

Mark Davis Thu, 01 Feb 2001 01:12:44 -0800
This is not an omission. This issue was debated at great length in the
Unicode technical committee, and the precise wording was agreed to by the
committee.

Mark
----- Original Message -----
From: "John Cowan" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Sent: Wednesday, January 31, 2001 11:18
Subject: Unicode 3.1: UTF-8


> I propose that the distinction between illegal and irregular UTF-8
> code sequences (D36bc) be eliminated.  Since there are no code points
> between U+D7FF and U+E000 (the apparently intervening code points
> are UTF-16 code units, but not Unicode code points)
> the corresponding UTF-8 code sequences should be illegal.
>
> This can be achieved by replacing the U+1000..U+FFFF row in
> Table 3.1B as follows:
>
> U+1000..U+CFFF   E1..EC   80..BF   80..BF
> U+D000..U+D7FF   ED       80..9F   80..BF   [9F underscored]
> U+E000..U+FFFF   EE..EF   80..BF   80..BF
>
> --
> There is / one art             || John Cowan <[EMAIL PROTECTED]>
> no more / no less              || http://www.reutershealth.com
> to do / all things             || http://www.ccil.org/~cowan
> with art- / lessness           \\ -- Piet Hein
>
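
The proposed rows can be read as a byte-range check. What follows is a minimal Python sketch under stated assumptions, not anything taken from the standard's text: the names THREE_BYTE_ROWS and is_legal_three_byte are hypothetical, only the U+1000..U+FFFF row is covered, and the last row's lead byte is read as EE..EF, since U+F000..U+FFFF is encoded with lead byte EF.

```python
# Sketch only: the three replacement rows proposed in the quoted message.
# Each entry is (lead byte range, second byte range); the third byte is
# always 80..BF. The names here are hypothetical, chosen for illustration.
THREE_BYTE_ROWS = [
    ((0xE1, 0xEC), (0x80, 0xBF)),  # U+1000..U+CFFF
    ((0xED, 0xED), (0x80, 0x9F)),  # U+D000..U+D7FF (second byte capped at 9F)
    ((0xEE, 0xEF), (0x80, 0xBF)),  # U+E000..U+FFFF (assumed EE..EF)
]


def is_legal_three_byte(seq: bytes) -> bool:
    """Return True if seq matches one of the proposed rows.

    Covers only the U+1000..U+FFFF part of the table; sequences with
    lead byte E0 (U+0800..U+0FFF) belong to a separate row and are
    reported as False here.
    """
    if len(seq) != 3:
        return False
    b1, b2, b3 = seq
    if not 0x80 <= b3 <= 0xBF:
        return False
    for (lo1, hi1), (lo2, hi2) in THREE_BYTE_ROWS:
        if lo1 <= b1 <= hi1 and lo2 <= b2 <= hi2:
            return True
    return False
```

Under these rows, ED 9F BF (U+D7FF) is accepted, while ED A0 80, which would correspond to the UTF-16 code unit D800, is rejected outright instead of being classed as irregular.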