Vlastimil Brom <[email protected]> added the comment:
I'd like to add some further observations to the mentioned issue;
it seems that the crash is indeed not specific to IDLE.
In a sample tkinter app, where I just display e.g. chr(66352) in an Entry
widget, I also get the same immediate crash via pythonw.exe and the previously
mentioned "proper" ValueError without a crash with python.exe.
I also tried to explicitly display a surrogate pair, which was used
automatically until Python 3.2; these can be used in tkinter in 3.3, but there
are limitations and discrepancies:
>>>
>>> got_ahsa = "\N{GOTHIC LETTER AHSA}"
>>> def wide_char_to_surrog_pair(char):
    code_point = ord(char)
    if code_point <= 0xFFFF:
        return char
    else:
        high_surr = (code_point - 0x10000) // 0x400 + 0xD800
        low_surr = (code_point - 0x10000) % 0x400 + 0xDC00
        return chr(high_surr)+chr(low_surr)
>>> ahsa_surrog = wide_char_to_surrog_pair(got_ahsa)
>>> print(ahsa_surrog)
𐌰
>>> repr(ahsa_surrog)
"'_ud800\x00udf30'"
>>> ahsa_surrog
'Pud800 udf30'
[the space in the middle of the last item might be \x00, as it terminates the
clipboard content, the rest is copied separately]
The printed square corresponds with the given character and can be used in
other programs etc. (Whereas in py 3.2 the same value was used for repr and a
direct "display" of the string in the interpreter, there are three different
formats in py 3.3.)
I also noticed that a surrogate pair is no longer supported as input for
unicodedata.name(...):
>>> import unicodedata
>>> unicodedata.name(ahsa_surrog)
Traceback (most recent call last):
  File "<pyshell#60>", line 1, in <module>
    unicodedata.name(ahsa_surrog)
TypeError: need a single Unicode character as parameter
>>>
(in 3.2 and probably others it returns the expected 'GOTHIC LETTER AHSA')
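As a possible workaround on 3.3, the pair can be combined back into a single code point before calling unicodedata.name(). This is only a minimal sketch: the helper name surrog_pair_to_char is hypothetical (it is not part of unicodedata), and it assumes the input is either one character or exactly one high/low surrogate pair.

```python
import unicodedata

def surrog_pair_to_char(text):
    """Combine a high/low surrogate pair into the code point it encodes.

    Hypothetical helper; any other input is returned unchanged.
    """
    if (len(text) == 2
            and 0xD800 <= ord(text[0]) <= 0xDBFF
            and 0xDC00 <= ord(text[1]) <= 0xDFFF):
        high = ord(text[0]) - 0xD800   # upper 10 bits of the offset
        low = ord(text[1]) - 0xDC00    # lower 10 bits of the offset
        return chr(0x10000 + high * 0x400 + low)
    return text

# The pair produced above for GOTHIC LETTER AHSA (U+10330)
print(unicodedata.name(surrog_pair_to_char("\ud800\udf30")))
# prints: GOTHIC LETTER AHSA
```

This is simply the inverse of the arithmetic in wide_char_to_surrog_pair above.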
(I for my part would think that keeping somewhat liberal (but still
unambiguous) input possibilities for unicodedata wouldn't hurt. Also, if
tkinter is not going to support wide unicode natively any time soon, the output
conversion using surrogates, which are also understandable for other programs,
seems the most usable option in this regard.)
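To illustrate what such an output conversion could look like for a whole string, here is a rough sketch. The function name to_surrogate_pairs is hypothetical, and this is not how tkinter actually converts anything; it only applies the same arithmetic as above to every character outside the BMP.

```python
def to_surrogate_pairs(text):
    """Replace every non-BMP character with its UTF-16 surrogate pair.

    Hypothetical helper; BMP characters are passed through unchanged.
    """
    parts = []
    for char in text:
        code_point = ord(char)
        if code_point <= 0xFFFF:
            parts.append(char)
        else:
            offset = code_point - 0x10000
            parts.append(chr(0xD800 + offset // 0x400)    # high surrogate
                         + chr(0xDC00 + offset % 0x400))  # low surrogate
    return "".join(parts)

# Show the code units rather than printing the surrogates themselves
print([hex(ord(c)) for c in to_surrogate_pairs("a\U00010330")])
# prints: ['0x61', '0xd800', '0xdf30']
```

The resulting string could then be handed to a widget in place of the original, with the limitations described above still applying.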
Hopefully, this is relevant for the original issue;
I am not sure whether some parts would be better posted as separate
issues, or whether this is the planned and expected behaviour anyway.
regards,
vbr
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue14200>
_______________________________________