On 10/12/2016 5:57 PM, Elliot Gorokhovsky wrote:
> On Wed, Oct 12, 2016 at 3:51 PM Nathaniel Smith <n...@pobox.com> wrote:
>> But this isn't relevant to Python's str, because Python's str never
>> uses UTF-8.
> Really? I thought in python 3, strings are all unicode...
They are ...
> so what encoding do they use, then?
Since 3.3, essentially ascii, latin1, utf-16 without surrogates (ucs2),
or utf-32, depending on the highest codepoint. This is the 'kind'
field. If we go this route, I suspect that optimizing string sorting
will take some experimentation. If the initial item is str, it might be
worthwhile to record the highest 'kind' during the type scan, so that
strncmp can be used if all are ascii or latin-1.
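
A rough Python sketch of the idea, not CPython's actual code: the helper names char_width and scan_kinds are made up for illustration, and the latin-1 byte comparison only stands in for what a C-level strncmp/memcmp on the 1-byte buffers would do.

```python
def char_width(s):
    """Bytes per code point that the 3.3+ storage needs for s (1, 2, or 4)."""
    highest = max(map(ord, s), default=0)
    if highest < 0x100:      # ascii or latin-1
        return 1
    if highest < 0x10000:    # ucs2
        return 2
    return 4                 # utf-32


def scan_kinds(items):
    """Widest char_width over all items, or None if any item is not a str."""
    widest = 1
    for item in items:
        if type(item) is not str:
            return None
        widest = max(widest, char_width(item))
    return widest


words = ["pear", "apple", "caf\u00e9", "cafe"]
if scan_kinds(words) == 1:
    # All strings fit in one byte per character, so comparing the raw bytes
    # gives the same order as comparing code points.
    by_bytes = sorted(words, key=lambda s: s.encode("latin-1"))
else:
    by_bytes = sorted(words)
```

With a single wider string in the list (say one containing U+20AC), scan_kinds returns 2 and the byte-wise shortcut has to be skipped, which is why the highest kind would need to be recorded during the type scan.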
--
Terry Jan Reedy
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/