In <[EMAIL PROTECTED]>, gabor wrote:

> Marc 'BlackJack' Rintsch wrote:
>> In <[EMAIL PROTECTED]>, Jean-Paul
>> Calderone wrote:
>>
>>>> How would you propose listdir should behave?
>>> Umm, just a wild guess, but how about raising an exception which includes
>>> the name of the file which could not be decoded?
>>
>> Suppose you have a directory with just some files having a name that can't
>> be decoded with the file system encoding.  So `listdir()` fails at this
>> point and raises an exception.  How would you get the names then?  Even the
>> ones that *can* be decoded?  This doesn't look very nice:
>>
>>     path = u'some path'
>>     try:
>>         files = os.listdir(path)
>>     except UnicodeError, e:
>>         files = os.listdir(path.encode(sys.getfilesystemencoding()))
>>     # Decode and filter the list "manually" here.
>
> i agree that it does not look very nice.
>
> but does this look nicer? :)
>
> path = u'some path'
> files = os.listdir(path)
>
> def check_and_fix_wrong_filename(file):
>     if isinstance(file,unicode):
>         return file
>     else:
>         #somehow convert it to unicode, and return it
>
> files = [check_and_fix_wrong_filename(f) for f in files]

I think this is very "special" code, as you can't use the fixed names to
open the files anymore unless you guess the encoding correctly.  I think
it's a bit fragile.  Wouldn't it be a better solution to convert the `path`
to the file system encoding before getting the file names?  That way you
can use all the names to process the files.

> in other words, your opinion is that the proposed solution is not
> optimal, or that the current behavior is fine?

I think the current behavior is okay but should be documented.  Maybe I
just haven't had enough use cases yet that needed the names as unicode
objects, and in my experience with Linux file systems, file names are just
byte strings with two limitations: no slashes and no zero bytes.  :-)

Ciao,
	Marc 'BlackJack' Rintsch
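
A minimal sketch of the "convert the `path` to the file system encoding" idea from the reply above. It assumes a current Python 3 interpreter rather than the Python 2 of the thread: `os.fsencode()` stands in for `path.encode(sys.getfilesystemencoding())`, and the helper names `list_names_as_bytes` and `read_all` are made up for illustration. Passing a byte path to `os.listdir()` yields byte names, which can be handed straight back to the OS without guessing an encoding.

```python
import os


def list_names_as_bytes(path):
    """Return the directory entries of `path` as undecoded byte strings."""
    # Assumption: os.fsencode() plays the role of
    # path.encode(sys.getfilesystemencoding()) from the thread.
    return os.listdir(os.fsencode(path))


def read_all(path):
    """Read every regular file in `path`, addressing it by its byte name."""
    byte_path = os.fsencode(path)
    contents = {}
    for name in os.listdir(byte_path):
        # Joining bytes with bytes keeps the name exactly as the OS stored it.
        full = os.path.join(byte_path, name)
        if os.path.isfile(full):
            with open(full, "rb") as handle:
                contents[name] = handle.read()
    return contents
```

Decoding for display can then be a separate, lossy step (for example `name.decode(encoding, "replace")`), while the byte name stays the key used to open the file.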