Re: why isn't Unicode the default encoding?

Martin v. Löwis Mon, 20 Mar 2006 13:44:24 -0800

John Salerno wrote:
> Robert Kern wrote:
> 
>>   http://www.joelonsoftware.com/articles/Unicode.html
> 
> That was fascinating. Thank you. So as it turns out, Unicode and UTF-8 
> are not the same thing? Am I right to say that UTF-8 stores the first 
> 128 Unicode code points in a single byte, and then stores higher code 
> points in however many bytes they may need? If so, I guess I had been 
> misled by the '8' in the name, thinking that UTF-8 was another way of 
> storing characters in one byte (which would make it no different than 
> Latin-1, I suppose).


That's all correct, except for the last parenthetical remark: using
a single-byte character set isn't the same as using Latin-1. There
are various single-byte character sets; they have names like Latin-2,
Latin-5, Latin-9 (ISO 8859-15), KOI8-R, CP437, windows-1252, and so on.
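
To make both points concrete, here is a small sketch. It assumes a
current Python 3 interpreter (where str is Unicode), and the sample
characters and the byte 0xA4 are just values picked for illustration.
The first part shows that UTF-8 uses one byte below code point 128 and
more bytes above it; the second shows that the same byte decodes to
different characters depending on which single-byte character set you
assume.

```python
# Part 1: UTF-8 is variable-width.
# The sample characters are arbitrary picks from different code point ranges.
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]  # A, e-acute, euro sign, G clef
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
for ch, n in zip(samples, utf8_lengths):
    print("U+%04X -> %d byte(s) in UTF-8" % (ord(ch), n))

# Part 2: one byte, several single-byte character sets.
# 0xA4 is an arbitrary byte above 127; the codec names are Python's own.
raw = b"\xa4"
charsets = ["latin-1", "iso8859-2", "iso8859-15", "koi8-r", "cp437", "cp1252"]
decoded = {name: raw.decode(name) for name in charsets}
for name, ch in decoded.items():
    print("%-10s 0xA4 -> U+%04X" % (name, ord(ch)))
```

Note that every one of these character sets still stores one character
per byte; they differ only in which characters the byte values stand
for, which is why "single-byte" alone does not tell you the encoding.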

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list
