Re: Python usage numbers

Terry Reedy Sun, 12 Feb 2012 19:17:13 -0800

On 2/12/2012 5:14 PM, Chris Angelico wrote:

On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy <tjre...@udel.edu> wrote:

The situation before ascii is like where we ended up *before* unicode.
Unicode aims to replace all those byte encodings and character sets with
*one* byte encoding for *one* character set, which will be a great
simplification. It is the idea of ascii applied on a global rather than
local basis.


Unicode doesn't deal with byte encodings; UTF-8 is an encoding,

The Unicode Standard specifies 3 UTF storage formats* and 8 UTF
byte-oriented transmission formats. UTF-8 is the most common of all
encodings for web pages. (And ascii pages are utf-8 also.) It is the
only one of the 8 that most of us need to bother with much. Look here
for the list

http://www.unicode.org/glossary/#U
and for details look in various places in
http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf
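
A small sketch of the point, assuming Python 3 (the sample string and the
helper names are made up for illustration): the same characters come out as
different byte sequences under each UTF, and pure-ascii text gives identical
bytes whether it is encoded as ascii or as utf-8.

```python
# Illustrative sample only: 7 characters, two of them outside ascii.
text = "na\u00efve \u20ac"  # n, a, i-with-diaeresis, v, e, space, euro sign


def encoded_lengths(s):
    """Return the byte length of s under three Unicode encodings.

    The explicit little-endian variants are used so that no byte order
    mark is prepended to the output.
    """
    return {name: len(s.encode(name))
            for name in ("utf-8", "utf-16-le", "utf-32-le")}


def ascii_is_utf8(s):
    """True if pure-ascii text s encodes to the same bytes as utf-8."""
    return s.encode("ascii") == s.encode("utf-8")


print(encoded_lengths(text))
print(ascii_is_utf8("plain ascii page"))
```

The character count is the same in every case; only the byte layout
differs, which is why an encoding has to be named whenever text crosses
a byte boundary.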

but so are UTF-16, UTF-32.

> and as many more as you could hope for.

All the non-UTF 'as many more as you could hope for' encodings are not
part of Unicode.

* The new internal unicode scheme for 3.3 is pretty much a mixture of
the 3 storage formats (I am, of course, skipping some details) by using
the widest one needed for each string. The advantage is avoiding
problems with each of the three. The disadvantage is greater internal
complexity, but that should be hidden from users. They will not need to
care about the internals. They will be able to forget about 'narrow'
versus 'wide' builds and the possible requirement to code differently
for each. There will only be one scheme that works the same on all
platforms. Most apps should require less space and about the same time.
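
A rough way to see the 'widest one needed' behaviour, assuming CPython 3.3
or later (sys.getsizeof reports an implementation detail, so other
implementations may give other numbers; the helper name is hypothetical):

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for strings made only of ch.

    Comparing two lengths cancels out the fixed object header, so the
    difference reflects only the per-character width chosen internally.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


samples = [
    ("ascii", "a"),             # fits in 1 byte per character
    ("bmp", "\u0394"),          # Greek capital delta
    ("astral", "\U0001F600"),   # outside the Basic Multilingual Plane
]
widths = {label: bytes_per_char(ch) for label, ch in samples}
print(widths)

# One code point is one character on every platform, with no
# narrow-versus-wide build difference.
print(len("\U0001F600"))
```

On such a build I would expect widths of 1, 2 and 4 bytes per character,
and a length of 1 for the astral character.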


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
