On Sun, 11 Nov 2001 12:57:27 -0800, in perl.unicode you wrote:

> ISO Latin-1 characters encoded as 10-FF in single bytes are not Unicode.

Hm? ISO Latin-1 characters from 00 to 7F encoded in single bytes represent
the same Unicode characters as those bytes interpreted as UTF-8, simply
because ASCII is a subset of both Latin-1 and UTF-8. 00 to 7F is that
common subset.

> There is no Unicode transformation format or other encoding that permits
> this. The code point range is actually x000010-x0000FF, and the encodings
> are
>
> 0000000010000000 0000000011111111 UTF-16 Big Endian

That first number is 0x80, not 0x10. If you meant 0x80 .. 0xFF, then I agree
with you.

Cheers,
Philip
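
P.S. A rough way to check both points: the sketch below is not from the
thread, it assumes Python 3 and its built-in "latin-1", "utf-8" and
"utf-16-be" codecs, and the helper name is_valid_utf8 and the variable names
are made up for illustration. It compares how bytes 00 to 7F decode under
both encodings, tests whether any single byte from 80 to FF is valid UTF-8 on
its own, and prints the UTF-16 big-endian bit patterns for U+0080 and U+00FF.

```python
# Bytes 0x00-0x7F: decode them as Latin-1 and as UTF-8 and compare.
ascii_bytes = bytes(range(0x00, 0x80))
same_below_80 = ascii_bytes.decode("latin-1") == ascii_bytes.decode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Bytes 0x80-0xFF: collect any that form valid UTF-8 as a lone byte.
high_valid_as_utf8 = [b for b in range(0x80, 0x100) if is_valid_utf8(bytes([b]))]

# UTF-16 big-endian code units for U+0080 and U+00FF, as 16-bit strings.
bits_80 = format(int.from_bytes("\u0080".encode("utf-16-be"), "big"), "016b")
bits_ff = format(int.from_bytes("\u00ff".encode("utf-16-be"), "big"), "016b")

print(same_below_80)       # True
print(high_valid_as_utf8)  # []
print(bits_80, bits_ff)    # 0000000010000000 0000000011111111
```

The two bit patterns match the ones quoted above, which is why the lower
bound of the range reads as 0x80 rather than 0x10.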