Re: [Python-Dev] bytes / unicode

Glyph Lefkowitz Tue, 22 Jun 2010 17:36:18 -0700

On Jun 22, 2010, at 7:23 PM, Ian Bicking wrote:

> This is a place where bytes+encoding might also have some benefit.  XML is 
> someplace where you might load a bunch of data but only touch a little bit of 
> it, and the amount of data is frequently large enough that the efficiencies 
> are important.


Different encodings have different characteristics, though, which makes them 
amenable to different types of optimizations.  If you've got an ASCII string or 
a latin1 string, the optimizations are pretty obvious, since every byte is 
exactly one code point; if you've got one in UTF-16 with no multi-code-unit 
sequences, you could also hypothetically cheat for a while if you're on a UCS4 
build of Python.
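
To make the encoding-specific shortcuts concrete, here is a minimal sketch of 
constant-time character lookup on still-encoded bytes.  The function names are 
made up for illustration, and the UTF-16 shortcut only holds under the stated 
assumption that the data contains no surrogate pairs.

```python
def latin1_char_at(data: bytes, index: int) -> str:
    """Return the character at `index` of latin-1 (or ASCII) encoded bytes."""
    # One byte per character, and the byte value equals the code point.
    return chr(data[index])


def utf16le_has_surrogates(data: bytes) -> bool:
    """Return True if UTF-16-LE data contains any surrogate code unit."""
    # In little-endian order the high byte of each code unit sits at odd offsets.
    return any(0xD8 <= high <= 0xDF for high in data[1::2])


def utf16le_char_at(data: bytes, index: int) -> str:
    """O(1) lookup; only valid when the data has no surrogate pairs."""
    # Assumption: every character occupies exactly one 2-byte code unit.
    unit = int.from_bytes(data[2 * index:2 * index + 2], "little")
    return chr(unit)


latin1 = "héllo".encode("latin-1")
utf16 = "héllo wörld".encode("utf-16-le")
astral = "a\U0001F600b".encode("utf-16-le")
```

Once a multi-code-unit sequence shows up, as in `astral` above, the index 
arithmetic no longer works and you have to fall back to a real decode.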

I suspect the practical problem here is that there's no CharacterString ABC in 
the collections module that third-party libraries could implement to provide 
their own peculiarly-optimized string types, ones that could lazily turn into 
real 'str's as needed.  I'd volunteer to write a PEP if I thought I could 
actually get it done :-\.  If someone else wants to be the primary author 
though, I'll try to help out.
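
As a rough sketch of what such an ABC might look like: everything below is 
hypothetical.  No CharacterString class exists in the standard library, and 
the method set, the LazyEncodedString name, and the latin-1 fast path are 
assumptions chosen only to illustrate the "lazily turn into a real str" idea.

```python
from abc import ABC, abstractmethod


class CharacterString(ABC):
    """Hypothetical ABC; nothing like this exists in the collections module."""

    @abstractmethod
    def __len__(self):
        ...

    @abstractmethod
    def __getitem__(self, index):
        ...

    @abstractmethod
    def __str__(self):
        ...


# Real strings would count as character strings too.
CharacterString.register(str)


class LazyEncodedString(CharacterString):
    """Keeps the encoded bytes and decodes them only when it has to."""

    def __init__(self, data: bytes, encoding: str):
        self._data = data
        self._encoding = encoding
        self._decoded = None  # filled in on the first full decode

    def __str__(self):
        if self._decoded is None:
            self._decoded = self._data.decode(self._encoding)
        return self._decoded

    def _fast_path(self) -> bool:
        # Assumption: only latin-1 is special-cased, one byte per character.
        return self._decoded is None and self._encoding == "latin-1"

    def __len__(self):
        if self._fast_path():
            return len(self._data)
        return len(str(self))

    def __getitem__(self, index):
        if self._fast_path() and isinstance(index, int):
            return chr(self._data[index])
        return str(self)[index]  # slices and other encodings decode fully


s = LazyEncodedString("café".encode("latin-1"), "latin-1")
```

Here `len(s)` and `s[3]` are answered from the raw bytes, and the full decode 
only happens the first time something calls `str(s)` or asks for a slice.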

