STINNER Victor <victor.stin...@haypocalc.com> added the comment:

> According to the Unicode standard the high and low surrogate halves used
> by UTF-16 (...)
Yes, but in Python, the U+DC80..U+DCFF range is used to store undecodable bytes. E.g. b'abc\xff'.decode('ascii', 'surrogateescape') gives 'abc\udcff'.

> Anyway, as you remark, my approach is a _patch_, designed to make python
> (2.x) work in an unicode environment, with the least amount of code
> change, for those willing to commit such a patch.

Python 2.7 is out and I think it is too late to fix Python2. Anyway, Python2 uses bytes for sys.path and other paths, so the problem only occurs if the user specifies unicode paths.

> In 3.x you may want to do things differently.

I chose to rewrite the C code to manipulate unicode paths instead of byte paths => #9425

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1552880>
_______________________________________
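
To illustrate the surrogateescape behaviour mentioned in the comment, here is a minimal Python 3 sketch. It is not taken from any patch on the tracker; the variable names (raw, text, strict_failed) are illustrative only. It assumes Python 3.1 or later, where the 'surrogateescape' error handler is available.

```python
# Round-trip an undecodable byte through the 'surrogateescape' error handler.
raw = b'abc\xff'  # 0xff is not a valid ASCII byte

text = raw.decode('ascii', 'surrogateescape')
# The undecodable byte 0xff is stored as the lone surrogate U+DCFF
# (0xDC00 + 0xff), which falls inside the U+DC80..U+DCFF range.
assert text == 'abc\udcff'

# Encoding with the same handler restores the original bytes exactly.
assert text.encode('ascii', 'surrogateescape') == raw

# A strict encoder refuses the lone surrogate, so the escaped string
# cannot be written out as ordinary UTF-8 by accident.
try:
    text.encode('utf-8')
except UnicodeEncodeError:
    strict_failed = True
else:
    strict_failed = False
```

This round trip is what allows byte paths that are not valid in the expected encoding to be carried in a str and later turned back into the same bytes.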