Re: recycling internationalized garbage

Ross Ridge Wed, 15 Mar 2006 00:10:46 -0800

Martin v. Löwis wrote:
> The point is that you can tell UTF-8 reliably. If the data decodes
> as UTF-8, it *is* UTF-8, because no other encoding in the world
> produces the same byte sequences (except for ASCII, which is
> an UTF-8 subset).


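It should be obvious that any 8-bit single-byte character set can
produce byte sequences that are also valid UTF-8. For example, the
bytes 0xC3 0xA9 are the two characters "Ã©" in ISO 8859-1 and the
single character "é" in UTF-8. In fact, I can't think of any
multi-byte encoding that can't produce a valid UTF-8 byte sequence.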
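
A minimal sketch of the point, assuming Python 3 and its standard
codecs. The helper name decodes_as and the choice of sample bytes and
encodings are mine, purely for illustration. It decodes the same two
bytes under several encodings; a successful UTF-8 decode does not by
itself show the data was written as UTF-8.

```python
# The same two bytes are legal in UTF-8, in several single-byte
# character sets, and in at least one multi-byte encoding (GB2312).
data = b"\xc3\xa9"

def decodes_as(raw, encoding):
    """Return the decoded text, or None if raw is invalid in encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

encodings = ("utf-8", "latin-1", "cp1252", "koi8-r", "gb2312")
results = {enc: decodes_as(data, enc) for enc in encodings}

for enc, text in results.items():
    print(f"{enc:8} -> {text!r}")
```

Every encoding in the list accepts the bytes, each yielding different
text, so which reading is "right" depends on information outside the
byte string itself.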

                         Ross Ridge

-- 
http://mail.python.org/mailman/listinfo/python-list
