Re: [Python-Dev] Internal representation of strings and Micropython

Hrvoje Niksic Fri, 06 Jun 2014 01:59:22 -0700

On 06/04/2014 05:52 PM, Mark Lawrence wrote:

On 04/06/2014 16:32, Steve Dower wrote:


If copying into a separate list is a problem (memory-wise),
re.finditer('\\S+', string) also provides the same behaviour and gives me
the sliced string, so there's no need to index for anything.
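
A rough sketch of the re.finditer approach mentioned above, assuming Python 3. The helper name iter_words and the sample string are made up for illustration and are not from the thread:

```python
import re

def iter_words(text):
    """Lazily yield (word, start, end) for each run of non-whitespace.

    Hypothetical helper: re.finditer returns an iterator of match objects,
    so no intermediate list of words is built.
    """
    for match in re.finditer(r'\S+', text):
        # group() is the matched substring; span() gives its offsets in text.
        start, end = match.span()
        yield match.group(), start, end

# Illustrative usage with a made-up sample string.
for word, start, end in iter_words("views  of a\tstring"):
    print(word, start, end)
```

Each word is still a newly created string object, so this avoids the intermediate list but not the per-word copy.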


Out of idle curiosity is there anything that stops MicroPython, or any
other implementation for that matter, from providing views of a string
rather than copying every time?  IIRC memoryviews in CPython rely on the
buffer protocol at the C API level, so since strings don't support this
protocol you can't take a memoryview of them.  Could this actually be
implemented in the future, is the underlying C code just too
complicated, or what?


Memory view of Unicode strings is controversial for two reasons:

1. It exposes the internal representation of the string. If memoryviews of strings were supported in Python 3, PEP 393 would not have been possible (without breaking that feature).

2. Even if it were OK to expose the internal representation, it might not be what the users expect. For example, memoryview("Hrvoje") would return a view of a 6-byte buffer, while memoryview("Nikšić") would return a view of a 12-byte UCS-2 buffer. The user of a memory view might expect to get UCS-2 (or UCS-4, or even UTF-8) in all cases.
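
To make the 6-byte versus 12-byte figures concrete, here is a minimal sketch that mimics how PEP 393 picks a per-character width from the highest code point in the string. The helper pep393_width is a hypothetical name, not a CPython API, and the sketch only computes the payload size; it does not inspect the real internal buffer:

```python
def pep393_width(s):
    """Bytes per character that a PEP 393 style layout would use.

    Hypothetical helper: the width is chosen from the highest code point,
    1 byte up to U+00FF, 2 bytes up to U+FFFF, 4 bytes otherwise.
    """
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1
    if highest < 0x10000:
        return 2
    return 4

for name in ("Hrvoje", "Nikšić"):
    width = pep393_width(name)
    # Character data size only, ignoring the object header.
    print(name, width, len(name) * width)
```

Both names have six characters, but š (U+0161) and ć (U+0107) lie above U+00FF, which forces the two-byte layout for the second one.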

An implementation that decided to export strings as memory views might be forced to make a decision about internal representation of strings, and then stick to it.

The byte objects don't have these issues, which is why in Python 2.7 memoryview("foo") works just fine, as does memoryview(b"foo") in Python 3.
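
A short sketch of that difference, assuming a Python 3 interpreter. The variable names are illustrative only:

```python
data = b"foo"

# bytes objects support the buffer protocol, so a view can be taken.
view = memoryview(data)
first = view[0]          # integer value of the first byte
tail = bytes(view[1:])   # slicing the view does not copy; bytes() does

# str does not export a buffer, so memoryview() refuses it.
try:
    memoryview("foo")
except TypeError as exc:
    str_error = exc
else:
    str_error = None

print(first, tail, type(str_error).__name__)
```

In Python 2.7 the literal "foo" is a byte string, which is why the same call succeeds there.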


