2008/12/13 John Machin <[email protected]>:
>
> Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = u'\u9876'
> >>> x
> u'\u9876'
>
> # As expected
>
> Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> x = '\u9876'
> >>> x
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\python30\lib\io.py", line 1491, in write
>     b = encoder.encode(s)
>   File "C:\python30\lib\encodings\cp850.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\u9876' in position 1: character maps to <undefined>
>
> # *NOT* as expected (by me, that is)
>
> Is this the intended outcome?
> --
> http://mail.python.org/mailman/listinfo/python-list
>
I also found this a bit surprising, but it seems to be the intended
behaviour (on a non-Unicode console):

http://docs.python.org/3.0/whatsnew/3.0.html
"PEP 3138: The repr() of a string no longer escapes non-ASCII
characters. It still escapes control characters and code points with
non-printable status in the Unicode standard, however."

So the interactive prompt now tries to write the character itself to the
console, and that fails when the console encoding (cp850 or cp852 here)
has no mapping for it. I get the same error in the Windows cmd console
(IDLE prints the respective glyph correctly).

To get the old behaviour of repr(), one can use ascii(), I suppose.

Python 3.0 (r30:67507, Dec 3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> repr('\u9876')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python30\lib\io.py", line 1491, in write
    b = encoder.encode(s)
  File "C:\Python30\lib\encodings\cp852.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u9876' in position 2: character maps to <undefined>
>>> '\u9876'.encode("unicode-escape")
b'\\u9876'
>>> ascii('\u9876')
"'\\u9876'"
>>>
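For what it's worth, here is a minimal sketch of the difference that runs without a console, so it should behave the same on any platform. The helper name console_safe is made up for illustration (it is not part of the standard library), and cp850 is used only because it is the code page in the traceback above. Your console may use a different one.

```python
s = '\u9876'

# Python 3: repr() keeps printable non-ASCII characters as they are (PEP 3138).
assert repr(s) == "'\u9876'"

# ascii() escapes them, which is what repr() did in Python 2.
assert ascii(s) == "'\\u9876'"

# Encoding the repr for a cp850 console is what fails at the prompt.
try:
    repr(s).encode("cp850")
    failed = False
except UnicodeEncodeError:
    failed = True


def console_safe(text, encoding="cp850"):
    """Hypothetical helper: replace characters the target encoding
    cannot represent with backslash escapes instead of raising."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


# With the escapes applied, the result matches ascii() for this string.
print(console_safe(repr(s)))
```

The "backslashreplace" error handler only escapes what the chosen encoding cannot represent, so on a cp850 console accented Latin letters would still be shown as glyphs, whereas ascii() escapes everything outside ASCII.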
