New submission from STINNER Victor <victor.stin...@haypocalc.com>:

In Python 3.2, the mbcs encoding (the default filesystem encoding on Windows) 
is now strict: it raises an error on unencodable characters and undecodable 
bytes. But os.listdir(b'.') replaces unencodable characters with b'?'.

Example:

>>> os.mkdir('listdir')
>>> open('listdir\\xxx-\u0363', 'w').close()
>>> filename = os.listdir(b'listdir')[0]
>>> filename
b'xxx-?'
>>> open(filename, 'r').close()
IOError: [Errno 22] Invalid argument: 'xxx-?'

os.listdir(b'listdir') should raise an error (and should neither ignore the 
filename nor replace unencodable characters with b'?').
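
To illustrate the difference between the two behaviours, here is a minimal 
sketch. It uses the 'ascii' codec as a stand-in for mbcs (an assumption, so 
that it runs on any platform); the exact set of unencodable characters under 
mbcs depends on the ANSI code page.

```python
name = 'xxx-\u0363'

# 'replace' silently substitutes b'?', so the original character is lost
# and the resulting bytes no longer name the real file.
lossy = name.encode('ascii', 'replace')

# 'strict' refuses to produce a name that cannot round-trip.
try:
    name.encode('ascii', 'strict')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(lossy, strict_failed)
```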

I think that we should list the directory using the wide character API 
(FindFirstFileW) but encode the filename using PyUnicode_EncodeFSDefault() if 
the directory name type is bytes, instead of using the ANSI API 
(FindFirstFileA).
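
A rough Python-level sketch of this idea follows. It is not the proposed C 
patch: the helper names encode_names_strict and listdir_bytes_strict are 
hypothetical, and the sketch assumes that passing a str path to os.listdir() 
is what selects the wide character API on Windows. The encoding parameter 
exists only so the behaviour can be tried with a codec other than the 
filesystem default.

```python
import os
import sys


def encode_names_strict(names, encoding=None):
    """Encode str filenames to bytes, raising on unencodable characters."""
    if encoding is None:
        # Assumption: the filesystem encoding ('mbcs' on Windows in 3.2).
        encoding = sys.getfilesystemencoding()
    # 'strict' raises UnicodeEncodeError instead of substituting b'?'.
    return [name.encode(encoding, 'strict') for name in names]


def listdir_bytes_strict(path, encoding=None):
    """List a directory given as bytes or str and return bytes names."""
    # Decode the path first so that os.listdir() returns str names.
    names = os.listdir(os.fsdecode(path))
    return encode_names_strict(names, encoding)
```

With such a helper, a directory containing 'xxx-\u0363' would produce a 
UnicodeEncodeError when the name cannot be encoded, rather than a bytes 
filename that cannot be opened.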

----------
components: Library (Lib), Unicode, Windows
messages: 115995
nosy: haypo, loewis
priority: normal
severity: normal
status: open
title: Windows : os.listdir(b'.') doesn't raise an error for unencodable 
filenames
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9820>
_______________________________________