On Jun 22, 2010, at 7:23 PM, Ian Bicking wrote:

> This is a place where bytes+encoding might also have some benefit. XML is
> someplace where you might load a bunch of data but only touch a little bit of
> it, and the amount of data is frequently large enough that the efficiencies
> are important.

Different encodings have different characteristics, though, which makes them amenable to different types of optimizations. If you've got an ASCII string or a latin1 string, the optimizations of unicode are pretty obvious; if you've got one in UTF-16 with no multi-code-unit sequences, you could also hypothetically cheat for a while if you're on a UCS4 build of Python.

I suspect the practical problem here is that there's no CharacterString ABC in the collections module for third-party libraries to provide their own peculiarly-optimized implementations that could lazily turn into real 'str's as needed.

I'd volunteer to write a PEP if I thought I could actually get it done :-\. If someone else wants to be the primary author though, I'll try to help out.
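
To make the idea a bit more concrete, here is a rough sketch of what one such third-party implementation might look like. Everything in it is an assumption for illustration: `LazyText`, its attributes, and the set of "one byte per character" encoding names are hypothetical, and it uses today's `collections.abc.Sequence` as a stand-in because no CharacterString ABC exists. It also assumes the bytes are valid in the declared encoding, since the fast path never checks.

```python
from collections.abc import Sequence


class LazyText(Sequence):
    """Hypothetical bytes+encoding wrapper that decodes only when it must."""

    # Encodings where every character is exactly one byte, so length and
    # indexing can be answered from the raw bytes (illustrative list only).
    _ONE_BYTE_PER_CHAR = {"ascii", "latin-1", "latin1", "iso-8859-1"}

    def __init__(self, data, encoding):
        self._data = bytes(data)
        self._encoding = encoding
        self._decoded = None  # cache for the real str, filled on demand

    def _can_cheat(self):
        # Only take the fast path while no full decode has happened yet.
        return (self._decoded is None
                and self._encoding.lower() in self._ONE_BYTE_PER_CHAR)

    def __str__(self):
        # This is the point where the object "turns into" a real str.
        if self._decoded is None:
            self._decoded = self._data.decode(self._encoding)
        return self._decoded

    def __len__(self):
        if self._can_cheat():
            return len(self._data)  # assumes the bytes are valid
        return len(str(self))

    def __getitem__(self, index):
        if not self._can_cheat():
            return str(self)[index]
        if isinstance(index, slice):
            return self._data[index].decode(self._encoding)
        # Decode just the one byte that was asked for.
        return bytes([self._data[index]]).decode(self._encoding)


text = LazyText(b"caf\xe9 au lait", "latin-1")
print(len(text), text[3])  # answered from the raw bytes, no full decode
print(str(text))           # first point where the whole buffer is decoded
```

A variable-width encoding such as UTF-8 simply falls through to the full decode on first use, which is the "lazily turn into a real str" behaviour described above; a UTF-16 variant that cheats until it sees a surrogate pair would need extra bookkeeping that this sketch leaves out.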