New submission from STINNER Victor <victor.stin...@haypocalc.com>:

In Python 3.2, the mbcs encoding (the default filesystem encoding on Windows) is now strict: it raises an error on unencodable characters and undecodable bytes. However, os.listdir(b'.') encodes unencodable characters as b'?' instead of raising an error.
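The difference between strict and lossy encoding can be reproduced on any platform with a short sketch. It rests on assumptions: cp1252 stands in for the Windows ANSI code page (the mbcs codec exists only on Windows), and encode_filename is a hypothetical helper, not part of the os module. In the real os.listdir(b'.') case the b'?' substitution is presumably made by the ANSI Windows API rather than by a Python codec, so this only mimics the visible result.

```python
# Assumption: cp1252 is used as a portable stand-in for the Windows ANSI
# code page; the real "mbcs" codec is only available on Windows.
ANSI_CODEC = "cp1252"


def encode_filename(name, errors="strict"):
    """Hypothetical helper: encode a str filename with the stand-in codec."""
    return name.encode(ANSI_CODEC, errors)


name = "xxx-\u0363"  # U+0363 has no representation in cp1252

# Lossy encoding: the unencodable character is silently replaced by b'?',
# which matches what the bytes listing shows for this filename.
lossy = encode_filename(name, errors="replace")

# Strict encoding: the unencodable character raises UnicodeEncodeError,
# which is the behaviour the report asks for.
try:
    encode_filename(name)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(lossy, strict_failed)
```

The lossy result no longer identifies the file on disk, which is why a later open() on it fails; the strict variant reports the problem at the point where the name cannot be represented.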
Example:

>>> import os
>>> os.mkdir('listdir')
>>> open('listdir\\xxx-\u0363', 'w').close()
>>> filename = os.listdir(b'listdir')[0]
>>> filename
b'xxx-?'
>>> open(filename, 'r').close()
IOError: [Errno 22] Invalid argument: 'xxx-?'

os.listdir(b'listdir') should raise an error (and neither ignore the filename nor replace unencodable characters with b'?').

I think that we should list the directory using the wide character API (FindFirstFileW) and, if the directory name is a bytes object, encode each filename using PyUnicode_EncodeFSDefault(), instead of using the ANSI API (FindFirstFileA).

----------
components: Library (Lib), Unicode, Windows
messages: 115995
nosy: haypo, loewis
priority: normal
severity: normal
status: open
title: Windows : os.listdir(b'.') doesn't raise an error for unencodable filenames
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9820>
_______________________________________