Arc Riley <arcri...@gmail.com> added the comment:

Amaury, you are absolutely correct: \ud801 on its own is not a valid
Unicode character (it is a lone high surrogate). However, I am not
giving Python \ud801; I am giving Python '𐑑' (== '\U00010451').
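
For reference, the relationship between the two values can be sketched
as follows. This is only an illustration, not the attached script: the
helper name surrogate_pair is hypothetical, and it applies the standard
UTF-16 surrogate arithmetic to U+10451.

```python
# Hypothetical helper for illustration: split a code point outside the
# Basic Multilingual Plane into its UTF-16 surrogate pair.
def surrogate_pair(code_point):
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

high, low = surrogate_pair(0x10451)
print(hex(high), hex(low))  # 0xd801 0xdc51
```

So \ud801 is the first half of the UTF-16 encoding of '\U00010451', not
a character in its own right.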

I am attaching a different short example, u.py, which demonstrates that
Python is mishandling UTF-8 both at the interactive terminal and in
scripts.

The output should be the same, but Python 3.1.1 compiled for wide
Unicode reports two different values.  As someone on #python-dev
found, '𐑑'.encode('utf-16').decode('utf-16') outputs the correct value.
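
A minimal sketch of the expected behaviour (this is not the attached
u.py, and the variable names are only illustrative). It assumes an
interpreter where a str holds full code points, i.e. a wide build or
Python 3.3 and later, and uses an escape instead of a literal so the
source file encoding does not matter.

```python
ch = '\U00010451'  # the character from this report, written as an escape

# Encode to bytes with both codecs.
utf8_bytes = ch.encode('utf-8')
utf16_bytes = ch.encode('utf-16')   # includes a byte order mark

# Decode again; both round trips should give back the same one character.
via_utf8 = utf8_bytes.decode('utf-8')
via_utf16 = utf16_bytes.decode('utf-16')

print(len(ch), hex(ord(ch)))
print(via_utf8 == ch, via_utf16 == ch)
```

On such an interpreter both comparisons should be True and the string
should have length 1; a surrogate such as \ud801 appearing in place of
the character indicates the mishandling described above.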

----------
Added file: http://bugs.python.org/file15032/u.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7045>
_______________________________________
_______________________________________________
Python-bugs-list mailing list