On Fri, Mar 18, 2016, at 03:00, Ian Kelly wrote:
> jmf has been asked this before, and as I recall he seems to feel that
> UTF-8 should be used for all purposes, ignoring the limitations of
> that encoding such as that indexing becomes a O(n) operation.

Just to play devil's advocate here, why is it so bad for indexing to be O(n)? Some simple caching is all that's needed to prevent it from making iteration O(n^2), if that's what you're worried about. Emacs' "multibyte string" type does this, among other trickery to represent non-Unicode characters and "raw bytes" as code points above 10FFFF (raw bytes are 3FFF80-3FFFFF).
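
To make the caching idea concrete, here is a minimal sketch in Python. It is not how Emacs implements its multibyte strings; the class name Utf8String, the steps counter, and the single-entry cache are all hypothetical choices made for illustration. The cache remembers the byte offset of the last code point looked up, so a loop over s[0], s[1], s[2], ... resumes from the previous position instead of rescanning from the start.

```python
class Utf8String:
    """Hypothetical UTF-8-backed string with a one-entry position cache."""

    def __init__(self, text: str) -> None:
        self._data = text.encode("utf-8")
        # Cache: the code point index most recently looked up and the
        # byte offset at which that code point starts.
        self._cached_index = 0
        self._cached_offset = 0
        # Bytes scanned so far; kept only to make the cost visible.
        self.steps = 0

    def _offset_of(self, index: int) -> int:
        """Return the byte offset of code point `index`."""
        if index < 0:
            raise IndexError(index)
        data = self._data
        # Resume from the cache when the target is at or after it;
        # otherwise fall back to scanning from the beginning.
        if index >= self._cached_index:
            i, off = self._cached_index, self._cached_offset
        else:
            i, off = 0, 0
        while i < index:
            if off >= len(data):
                raise IndexError(index)
            # Step over the lead byte, then any continuation bytes,
            # which all have the bit pattern 10xxxxxx.
            off += 1
            self.steps += 1
            while off < len(data) and (data[off] & 0xC0) == 0x80:
                off += 1
                self.steps += 1
            i += 1
        if off >= len(data):
            raise IndexError(index)
        self._cached_index, self._cached_offset = i, off
        return off

    def __getitem__(self, index: int) -> str:
        data = self._data
        start = self._offset_of(index)
        end = start + 1
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end += 1
        return data[start:end].decode("utf-8")
```

With this sketch, indexing every position in ascending order scans each byte at most once in total, so iteration by index stays O(n) overall. A lookup that jumps backwards still pays for a rescan from the start, which is the case a real implementation would need more than one cache entry to handle well.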