STINNER Victor added the comment:

I'm OK with keeping the calls to the ANSI versions of the Windows API when 
bytes filenames are used (so you get question marks on encoding errors).
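To illustrate the question marks: a rough sketch in pure Python, assuming the 
ANSI code page is cp1252 (it depends on the Windows locale) and using the 
"replace" error handler as a stand-in for what the ANSI API does with 
characters it cannot represent. The real Windows conversion can also apply 
"best fit" mappings, which this sketch does not model. The filename is a 
made-up example.

```python
# Assumption: the ANSI code page is cp1252 (Western European Windows).
ANSI_CODE_PAGE = "cp1252"

# Hypothetical filename with characters that cp1252 cannot represent.
name = "\u4e2d\u6587.txt"

# "replace" substitutes b"?" for each unencodable character, similar in
# spirit to what the ANSI API returns for such filenames.
lossy = name.encode(ANSI_CODE_PAGE, errors="replace")

# Decoding the bytes again does not give the original name back: the
# information was lost at encoding time.
roundtrip = lossy.decode(ANSI_CODE_PAGE)

print(lossy)
print(roundtrip == name)
```

The point is that the loss happens in the conversion to bytes, so nothing 
done afterwards with the bytes filename can recover the original name.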

> Another alternative would be to switch to UTF-8 as the file system encoding 
> on Windows, but that change might be too incompatible.

On Linux, I tried to have more than one "OS" encoding and it was a big fail 
(search for the "PYTHONFSENCODING" env var in the Python history). It 
introduced many new tricky issues. In short, Python should use the same "OS 
encoding" *everywhere*. Since there are many places where Python doesn't 
control the encoding, we must use the same encoding as the OS. For example, 
os.listdir(b'.') uses the ANSI code page. If you concatenate two strings, one 
encoded to UTF-8 and the other encoded to the ANSI code page, you will at 
least see mojibake, and your operation will probably fail (ex: unable to open 
the file).
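A small sketch of the concatenation problem, again assuming cp1252 as the 
ANSI code page. The directory and file names are invented for the example; 
the directory part stands for bytes coming from the OS (ANSI code page) and 
the file part for bytes that the application encoded to UTF-8 itself.

```python
# Assumption: the ANSI code page is cp1252; names are hypothetical.
directory = "d\u00e9p\u00f4t".encode("cp1252")    # as returned by the OS
filename = "caf\u00e9.txt".encode("utf-8")        # encoded by the application

# Joining the two gives a byte string with two encodings mixed together.
mixed = directory + b"\\" + filename


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoded with the ANSI code page: the UTF-8 part becomes mojibake.
as_ansi = try_decode(mixed, "cp1252")

# Decoded as UTF-8: the ANSI part is not valid UTF-8 at all.
as_utf8 = try_decode(mixed, "utf-8")

print(repr(as_ansi))
print(repr(as_utf8))
```

Neither decoding gives the intended path, so a later attempt to open the file 
with this name would be looking for a file that does not exist.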

I mean that forcing an encoding *everywhere* is a losing battle. There are too 
many external functions using the locale encoding on UNIX and the ANSI code 
page on Windows. Not only in the C library: think also of OpenSSL, just to 
give you one example.

Anyway, bytes filenames have been deprecated since Python 3.2, so maybe it's 
time to stop using them!

--

Another alternative is to completely drop support for bytes filenames on 
Windows in Python 3.5. But I expect that too many applications would just 
fail. It's too early for such a disruptive change.

So I'm just closing the issue as "not a bug", because Python just follows the 
vendor's choice (Microsoft decided to use funny question marks :-)).

----------
resolution:  -> not a bug
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13247>
_______________________________________