STINNER Victor <victor.stin...@haypocalc.com> added the comment:

> According to the Unicode standard the high and low surrogate halves used
> by UTF-16 (...)

Yes, but in Python, U+DC80..D+DCFF range is used to store undecodable bytes. 
Eg. 'abc\xff'.decode('ascii', 'surrogateescape') gives 'abc\udcff'.

> Anyway, as you remark, my approach is a _patch_, designed to make python
> (2.x) work in an unicode environment, with the least amount of code
> change, for those willing to commit such a patch.

Python 2.7 is out and I think it is too late to fix Python2. Anyway, Python2 
uses bytes for sys.path or other paths, so the problem only occurs if the user 
specifies unicode paths.

> In 3.x you may want to do things differently.

I choosed to rewrite the C code to manipulate unicode paths instead of byte 
paths => #9425

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1552880>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to