Re: Blog "about python 3"

Terry Reedy Sun, 05 Jan 2014 14:51:44 -0800

On 1/5/2014 9:23 AM, wxjmfa...@gmail.com wrote:

> My examples are ONLY ILLUSTRATING, this FSR
> is wrong by design,

Let me answer you a different way. If FSR is 'wrong by design', so are the alternatives. Hence, the claim is, in itself, useless as a guide to choosing. The choices:

* Keep the previous complicated system of buggy narrow builds on some systems and space-wasting wide builds on other systems, with Python code potentially acting differently on the different builds. I am sure that you agree that this is a bad design.

* Improve the dual-build system by de-bugging narrow builds. I proposed to do this (and gave Python code proving the idea) by adding the complication of an auxiliary array of indexes of astral chars in a UTF-16 string. I suspect you would call this design 'wrong' also.

* Use the memory-wasting UTF-32 (wide) build on all systems. I know you do not consider this 'wrong', but come on. From an information-theoretic and coding viewpoint, it clearly is. The top (4th) byte is *never* used. The 3rd byte is *almost never* used. The 2nd byte usage ranges from common to almost never for different users.

Memory waste is also time waste, as moving information-free 0 bytes takes the same time as moving informative bytes.
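
To make the byte-usage point concrete, here is a minimal sketch (not from the original post; the helper names are made up for illustration) that encodes text as UTF-32 and counts how many of the stored bytes are zero. It uses only the standard 'utf-32-le' codec, which emits 4 bytes per code point with no BOM.

```python
def zero_byte_fraction(text: str) -> float:
    """Fraction of bytes that are zero when text is stored as UTF-32."""
    data = text.encode("utf-32-le")  # 4 bytes per code point, no BOM
    if not data:
        return 0.0
    return data.count(0) / len(data)


def top_byte_always_zero(text: str) -> bool:
    """Check that the 4th byte of every UTF-32 code unit is zero."""
    data = text.encode("utf-32-le")
    # In little-endian order the most significant byte is at offset 3.
    return all(b == 0 for b in data[3::4])


print(zero_byte_fraction("hello world"))            # 0.75
print(top_byte_always_zero("a\u20ac\U0001F600"))    # True
```

For pure ASCII text, three of every four bytes carry no information. The top byte is zero for any string, since code points stop at U+10FFFF.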

Here is the beginning of the rationale for the FSR (from http://www.python.org/dev/peps/pep-0393/ -- have you ever read it?).

"There are two classes of complaints about the current implementation of the unicode type: on systems only supporting UTF-16, users complain that non-BMP characters are not properly supported. On systems using UCS-4 internally (and also sometimes on systems using UCS-2), there is a complaint that Unicode strings take up too much memory - especially compared to Python 2.x, where the same code would often use ASCII strings...".

The memory waste was a reason to stick with 2.7. The extra memory use could break code that worked in 2.x. By removing the waste, the FSR makes switching to Python 3 more feasible for some people. It was a response to real problems encountered by real people using Python. It fixed both classes of complaint about the previous system.
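
The effect of the FSR on memory can be observed directly. The following is a rough sketch, assuming CPython 3.3 or later (object sizes are an implementation detail, and bytes_per_char is a made-up helper name): it measures how much sys.getsizeof grows per added character, which cancels out the fixed object header.

```python
import sys


def bytes_per_char(ch: str) -> int:
    """Estimate per-character storage for strings made only of `ch`.

    Compares two freshly built strings of different lengths so that the
    fixed per-object overhead cancels out. CPython-specific.
    """
    small = ch * 1000
    big = ch * 2000
    return (sys.getsizeof(big) - sys.getsizeof(small)) // 1000


print(bytes_per_char("a"))           # ASCII
print(bytes_per_char("\u20ac"))      # BMP, outside Latin-1
print(bytes_per_char("\U0001F600"))  # astral (non-BMP)
```

On a CPython with the FSR this should report 1, 2 and 4 bytes per character respectively: each string pays only for the widest character it actually contains.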

* Switch to the time-wasting UTF-8 for text storage, as some have done. This is different from using UTF-8 for text transmission, which I hope becomes the norm soon.
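
One way to see where the time goes with UTF-8 storage: finding the n-th character means scanning from the start, because characters occupy 1 to 4 bytes each. This is a minimal sketch of that scan, assuming valid UTF-8 input; utf8_char_at is a hypothetical helper, not an API of any implementation.

```python
def utf8_char_at(data: bytes, index: int) -> str:
    """Return the code point at position `index` of UTF-8 encoded `data`.

    Runs in O(n): every byte before the target has to be inspected,
    whereas a fixed-width representation can jump straight to it.
    """
    if index < 0:
        raise IndexError("negative indexes are not supported in this sketch")
    count = -1    # index of the code point currently being scanned
    start = None  # byte offset where the requested code point begins
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a new code point
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return data[start:pos].decode("utf-8")
    if start is not None:  # requested code point was the last one
        return data[start:].decode("utf-8")
    raise IndexError("code point index out of range")


text = "a\u00e9\u20ac\U0001F600z"
data = text.encode("utf-8")
print([utf8_char_at(data, i) for i in range(len(text))] == list(text))  # True
```

Implementations that store UTF-8 can reduce this cost with caching or extra index structures, but that is added complication of the same kind as in the second option.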

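For the second option in the list, the auxiliary-array idea can be sketched as follows. This is not the code I posted at the time, only a small illustration of the general approach; the class name IndexedUTF16 and its layout are assumptions made for the example. The string is kept as UTF-16 code units, plus a sorted list of the code point indexes at which astral characters occur, so that indexing can be corrected with a binary search.

```python
from bisect import bisect_left


class IndexedUTF16:
    """UTF-16 code units plus a sorted list of astral character indexes."""

    def __init__(self, text: str) -> None:
        self._units = []   # UTF-16 code units as ints
        self._astral = []  # code point indexes of non-BMP characters
        for i, ch in enumerate(text):
            cp = ord(ch)
            if cp > 0xFFFF:
                self._astral.append(i)
                cp -= 0x10000
                self._units.append(0xD800 + (cp >> 10))    # high surrogate
                self._units.append(0xDC00 + (cp & 0x3FF))  # low surrogate
            else:
                self._units.append(cp)

    def __len__(self) -> int:
        # Each astral character takes two units but counts as one character.
        return len(self._units) - len(self._astral)

    def __getitem__(self, i: int) -> str:
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        # Number of astral characters strictly before position i.
        k = bisect_left(self._astral, i)
        pos = i + k  # each earlier astral character shifts us by one unit
        unit = self._units[pos]
        if k < len(self._astral) and self._astral[k] == i:
            low = self._units[pos + 1]
            return chr(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
        return chr(unit)


text = "a\U0001F600b\U00010000c"
s = IndexedUTF16(text)
print(len(s), [s[i] for i in range(len(s))] == list(text))  # 5 True
```

Indexing costs O(log k) for k astral characters, and strings with none pay nothing beyond an empty list. Whether that counts as less 'wrong' than the FSR is the same kind of judgment call as for the other options.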

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list
