Antoine Pitrou added the comment:
With the system Python on s10:
Python 2.6.8 (unknown, Apr 13 2012, 17:08:12) [C] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.strxfrm('a')
'a'
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.strxfrm('a')
'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'
>>> locale.strxfrm('a').decode('utf-8')
u'\x01\x01\x01\x0e\x01\x01\x01\x01\x01\x01\x01\x02\x01\x01\x0fi\x01\x01\x01\x01'
The difference between Python 2 and Python 3 is that Python 3 uses wcsxfrm(),
not strxfrm(). Apparently Solaris' wcsxfrm() is broken: it returns the same
data as strxfrm(), cast to a wchar_t *, hence the character U+101010e
(corresponding to the '\x01\x01\x01\x0e' bytestring above), which lies beyond
U+10FFFF, the upper limit of the Unicode range.
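
To make the arithmetic concrete, here is a minimal Python 3 sketch (not taken from the tracker) that reads the first four bytes of the strxfrm() output as a single 32-bit value. The big-endian byte order is an assumption, chosen because it is the reading that reproduces the value quoted above; the variable names are only illustrative.

```python
import sys

# First four bytes of the strxfrm('a') result shown in the session above.
prefix = b"\x01\x01\x01\x0e"

# Assumption: the four bytes are read as one big-endian 32-bit integer,
# which is what yields the value quoted in the comment.
code_point = int.from_bytes(prefix, "big")

print(hex(code_point))              # the value interpreted as a code point
print(code_point > sys.maxunicode)  # whether it exceeds the Unicode limit

# chr() refuses values beyond the Unicode range.
try:
    chr(code_point)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

If the bytes really are reinterpreted this way, the resulting value cannot be represented as a Python 3 str character, which would be consistent with the behaviour described here.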
----------
nosy: +loewis, pitrou
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue16258>
_______________________________________