Re: perlunitut - feedback appreciated

Philip Newton Sun, 11 Nov 2001 23:07:53 -0800

On Sun, 11 Nov 2001 12:57:27 -0800, in perl.unicode you wrote:

> ISO Latin-1 characters encoded as 10-FF in single bytes are not Unicode.


Hm? ISO Latin-1 characters from 00 to 7F encoded in single bytes
represent the same Unicode characters as those bytes interpreted as
UTF-8, simply because ASCII is a subset of both Latin-1 and UTF-8, and
00 to 7F is that common subset.
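
A quick way to check the point above, sketched in Python rather than
Perl purely for illustration (the names ascii_bytes and is_valid_utf8
are made up for this example): decode every byte from 00 to 7F both
ways and compare, then try each byte from 80 to FF on its own as UTF-8.

```python
# Bytes 0x00-0x7F: the same characters whether read as Latin-1 or UTF-8.
ascii_bytes = bytes(range(0x00, 0x80))
same_low = ascii_bytes.decode("latin-1") == ascii_bytes.decode("utf-8")


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Bytes 0x80-0xFF: each is a Latin-1 character, but none is valid
# UTF-8 when it stands alone as a single byte.
high_valid = [is_valid_utf8(bytes([n])) for n in range(0x80, 0x100)]

print(same_low)         # True
print(any(high_valid))  # False
```

So the two interpretations agree on 00 to 7F and part ways from 80 up.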

> There is no Unicode transformation format or other encoding that permits
> this. The code point range is actually x000010-x0000FF, and the encodings
> are
> 
> 0000000010000000  0000000011111111 UTF-16 Big Endian

That first number is 0x80, not 0x10. If you meant 0x80 .. 0xFF, then I
agree with you.
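
To make the bit patterns concrete, here is a small sketch (Python,
for illustration only; utf16be_bits is a hypothetical helper, not
anything from perlunitut) that prints the UTF-16 big-endian encoding
of a code point as a string of bits.

```python
def utf16be_bits(code_point):
    """Return the UTF-16BE encoding of a code point as a bit string."""
    encoded = chr(code_point).encode("utf-16-be")
    return "".join(f"{byte:08b}" for byte in encoded)


print(utf16be_bits(0x80))  # 0000000010000000
print(utf16be_bits(0xFF))  # 0000000011111111
print(utf16be_bits(0x10))  # 0000000000010000
```

The first two lines match the quoted range, so its lower bound is
U+0080; U+0010 would give the third pattern instead.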

Cheers,
Philip
