Serhiy Storchaka <[email protected]> added the comment:
Example:
>>> '\u0100'
'Ā'
>>> '\u0100\U00010000'
'\u0100\U00010000'
>>> print('\u0100')
Ā
>>> print('\u0100\U00010000')
Traceback (most recent call last):
  File "<pyshell#33>", line 1, in <module>
    print('\u0100\U00010000')
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 1-1: Non-BMP character not supported in Tk
But I think this is too specific a problem with too specific a solution. It
would be better if IDLE itself escaped the string in the most appropriate way.
def utf8bmp_encode(s):
    # Keep BMP characters as-is; escape anything above U+FFFF as \UXXXXXXXX.
    return ''.join(c if ord(c) <= 0xffff else '\\U%08x' % ord(c)
                   for c in s).encode('utf-8')
or
import re

def utf8bmp_encode(s):
    # Same result, using a regex that matches only non-BMP characters.
    return re.sub('[^\x00-\uffff]', lambda m: '\\U%08x' % ord(m.group()),
                  s).encode('utf-8')
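
As a rough sketch of what either variant does to the string from the example
above: the BMP character is encoded as ordinary UTF-8, while the non-BMP
character is replaced by its \UXXXXXXXX escape. The helper name escape_non_bmp
is hypothetical and only used for illustration; it is not part of IDLE.

```python
def escape_non_bmp(s):
    # Hypothetical helper: replace each character above U+FFFF with a
    # literal \UXXXXXXXX escape and leave BMP characters untouched.
    return ''.join(c if ord(c) <= 0xffff else '\\U%08x' % ord(c) for c in s)


def utf8bmp_encode(s):
    # Escape non-BMP characters first, then encode the result as UTF-8.
    return escape_non_bmp(s).encode('utf-8')


sample = '\u0100\U00010000'
encoded = utf8bmp_encode(sample)
# Prints the bytes repr, which is ASCII-only and safe on any console.
print(encoded)
```

The escaped text contains only BMP characters, so it no longer triggers the
"Non-BMP character not supported in Tk" error shown in the traceback.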
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue14304>
_______________________________________