On 06/11/2001 10:45:46 PM Mark Davis wrote:

[earlier]
> - Oracle could probably make a case for their name for UTF8 simply being
> an anachronism. After all, the original definition of UTF-8 did convert
> surrogate pairs as they are doing in what they call UTF8.

[now]
>UTF-8 was defined before UTF-16. At the time it was first defined, there
>were no surrogates, so there was no special handling of the D800..DFFF
>code points.

The critical thing, though, is that in UTF-8 as originally designed, there
was no question about the meaning of < ED A0 80 ED B0 80 > or of
< F0 90 80 80 >, nor about whether either could mean U-00010000. They
definitely did not mean the same thing, and the former definitely did not
mean U-00010000. So Oracle would fail utterly if judged on that basis.



- Peter


---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>


