New submission from STINNER Victor <[email protected]>:
In Python 3.2, the mbcs encoding (the default filesystem encoding on Windows)
is now strict: it raises an error on unencodable characters and undecodable
bytes. But os.listdir(b'.') still replaces unencodable characters with b'?'.
Example:
>>> os.mkdir('listdir')
>>> open('listdir\\xxx-\u0363', 'w').close()
>>> filename = os.listdir(b'listdir')[0]
>>> filename
b'xxx-?'
>>> open(filename, 'r').close()
IOError: [Errno 22] Invalid argument: 'xxx-?'
os.listdir(b'listdir') should raise an error (and not ignore the filename or
replace unencodable characters with b'?').
I think that we should list the directory using the wide character API
(FindFirstFileW) but encode the filename using PyUnicode_EncodeFSDefault() if
the directory name type is bytes, instead of using the ANSI API
(FindFirstFileA).
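
The behaviour proposed above can be approximated in pure Python as a rough
sketch; the real change would live in the C implementation of os.listdir().
The helper names encode_names_strict() and listdir_bytes_strict() are
hypothetical, and the explicit encoding parameter stands in for what
PyUnicode_EncodeFSDefault() would pick (note that on recent Python versions
the Windows filesystem encoding is no longer mbcs by default):

```python
import os
import sys


def encode_names_strict(names, encoding):
    """Encode str filenames with the 'strict' error handler.

    An unencodable character raises UnicodeEncodeError instead of being
    silently replaced by b'?'.
    """
    return [name.encode(encoding, "strict") for name in names]


def listdir_bytes_strict(path, encoding=None):
    """Hypothetical helper: list a directory given as bytes or str.

    The listing is done through the str API (which maps to the wide
    character API on Windows), then each name is encoded strictly.
    """
    if encoding is None:
        # Assumption: fall back to the interpreter's filesystem encoding.
        encoding = sys.getfilesystemencoding()
    names = os.listdir(os.fsdecode(path))
    return encode_names_strict(names, encoding)
```

For instance, encode_names_strict(['xxx-\u0363'], 'cp1252') raises
UnicodeEncodeError, because U+0363 has no mapping in that code page, which is
the outcome requested here in place of b'xxx-?'.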
----------
components: Library (Lib), Unicode, Windows
messages: 115995
nosy: haypo, loewis
priority: normal
severity: normal
status: open
title: Windows : os.listdir(b'.') doesn't raise an error for unencodable
filenames
versions: Python 3.2
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue9820>
_______________________________________