On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:

We already *have* a type in Python 3.3 that provides text
manipulations on arrays of 8-bit objects: str (per PEP 393).

  > BTW: I don't know why so many people keep asking for use cases.
  > Isn't it obvious that text data without known (but ASCII compatible)
  > encoding or multiple different encodings in a single data chunk
  > is part of life ?

Isn't it equally obvious that if you create or read all such ASCII-
compatible chunks as (encoding='ascii', errors='surrogateescape') that
you *don't need* string APIs for bytes?
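
A minimal sketch of that approach, assuming a chunk whose ASCII parts are meaningful and whose other bytes are in some unknown encoding (the sample bytes below are made up for illustration):

```python
# Hypothetical sample: an ASCII header followed by bytes in an unknown encoding.
raw = b"Subject: caf\xe9 \xff\xfe\n"

# Decode as ASCII; each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
text = raw.decode("ascii", errors="surrogateescape")

# Ordinary str methods work on the ASCII portion.
header, _, value = text.partition(": ")

# Encoding with the same error handler restores the original bytes exactly.
roundtrip = text.encode("ascii", errors="surrogateescape")
assert roundtrip == raw
```

The escaped bytes travel through the str as lone surrogates, so they survive splitting and joining, but encoding such a str without errors='surrogateescape' raises UnicodeEncodeError.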

Why do these "text chunks" need to be bytes in the first place?
That's why we ask for use cases.  AFAICS, reading and writing ASCII-
compatible text data as 'latin1' is just as fast as bytes I/O.  So
it's not I/O efficiency, and (since in this model we don't do any
en/decoding on bytes/str), it's not redundant en/decoding of bytes to
str and back.
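
To illustrate the 'latin1' variant, here is a small sketch (not a benchmark; the sample data is invented). It relies only on the fact that latin-1 maps each byte 0x00..0xFF to the code point with the same value, so any byte string round-trips:

```python
# Every possible byte value, to show that nothing is lost.
all_bytes = bytes(range(256))

as_text = all_bytes.decode("latin-1")          # never fails, one char per byte
back_to_bytes = as_text.encode("latin-1")      # exact inverse
assert back_to_bytes == all_bytes

# Hypothetical ASCII-compatible record with some opaque non-ASCII payload.
record = b"KEY=\xa4\xb5\xc6;FLAG=on"
fields = dict(item.split("=", 1) for item in record.decode("latin-1").split(";"))
payload = fields["KEY"].encode("latin-1")      # original payload bytes again
```

One caveat with this variant: the non-ASCII bytes become real Latin-1 characters, so case-changing methods such as str.upper() may alter them, whereas surrogate-escaped bytes are left alone.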

The problem with some criticisms of 'Unicode in Python 3' is that there really is no single such thing. Unicode in 3.0 to 3.2 used the old internal model inherited from 2.x. Unicode in 3.3+ uses a different internal model (PEP 393) that is a game changer with respect to certain issues of space and time efficiency (and cross-platform correctness and portability). So at least some of the criticisms based on the old model are out of date and no longer valid.
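
A rough way to see the space side of this on CPython 3.3+ is to measure how much a string grows per added character. This is only a sketch: sys.getsizeof reports CPython implementation details, and the helper name per_char_bytes is my own, not part of any API.

```python
import sys

def per_char_bytes(ch):
    """Return how many bytes one extra copy of `ch` adds to a long string."""
    return sys.getsizeof(ch * 1001) - sys.getsizeof(ch * 1000)

ascii_cost = per_char_bytes("a")             # ASCII-only string
latin1_cost = per_char_bytes("\xe9")         # all code points below U+0100
bmp_cost = per_char_bytes("\u20ac")          # needs 2 bytes per code point
astral_cost = per_char_bytes("\U0001F600")   # needs 4 bytes per code point

print(ascii_cost, latin1_cost, bmp_cost, astral_cost)
```

With the flexible representation the per-character cost depends on the widest code point in the string, so ASCII-compatible text held in a str takes about one byte per character, the same as bytes.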

--
Terry Jan Reedy
