On 2008-12-09 09:41, Anders J. Munch wrote:
> On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
>>>> try:
>>>>     files = os.listdir(somedir, errors = strict)
>>>> except OSError as e:
>>>>     log(<verbose error message that includes somedir and e>)
>>>>     files = os.listdir(somedir)
>
> Instead of a codecs error handler name, how about a callback for
> converting bytes to str?
>
> os.listdir(somedir, decoder=bytes.decode)
> os.listdir(somedir, decoder=lambda b: b.decode(preferredencoding,
>                                                errors='xmlcharrefreplace'))
> os.listdir(somedir, decoder=repr)
>
> ISTM that would be simpler and more flexible than going over the
> codecs registry. One caveat though is that there's no obvious way of
> telling listdir to skip a name. But if the default behaviour for
> decoder=None is to skip with a warning, then the need to explicitly
> ask for files to be skipped would be small.
>
> Terry's example would then be:
>
>>>> try:
>>>>     files = os.listdir(somedir, decoder=bytes.decode)
>>>> except UnicodeDecodeError as e:
>>>>     log(<verbose error message that includes somedir and e>)
>>>>     files = os.listdir(somedir)

Well, this is not too far away from just putting the whole decoding
logic into the application directly:

    files = [filename.decode(filesystemencoding, errors='warnreplace')
             for filename in os.listdir(dir)]

(or os.listdirb() if that's where the discussion is heading)

... and that also tells us something about this discussion: we're
trying to come up with some magic to work around writing two lines
of Python code.

I'd just have all the os APIs return bytes and leave whatever
conversion to Unicode might be necessary to a higher level API.

Think of it: you really only need the Unicode values if you ever want
to output those values in text form somewhere. In those cases, it's
usually a human reading a log file or screen output. Most other cases
just care about getting some form of file identifier in order to open
the file and don't really care about the encoding of the file name at
all.

It's probably better to have two helper functions in the os module
that take care of the conversion on demand, e.g. os.decodefilename()
and os.encodefilename(), rather than trying to force this conversion
even in cases where the application never really needs to write the
filename somewhere.

These should then provide some reasonable default logic, e.g. use a
'warnreplace' error handler. Applications are then free to use these
converters or implement their own.

-- 
Marc-Andre Lemburg
eGenix.com
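
A minimal sketch of the two helper functions proposed above, assuming
Python 3 semantics. The names decodefilename() and encodefilename() come
from the message, but they are written here as plain module-level
functions, not as real os module APIs. 'warnreplace' is not a standard
codecs error handler, so the sketch registers a hypothetical one that
emits a warning and then behaves like 'replace'.

```python
import codecs
import os
import sys
import warnings


def warnreplace(exc):
    """Hypothetical 'warnreplace' handler: warn, then act like 'replace'."""
    warnings.warn("problem converting file name: %s" % exc, UnicodeWarning)
    # codecs.replace_errors returns (replacement, resume_position).
    return codecs.replace_errors(exc)


# Make the handler available under the name used in the message.
codecs.register_error("warnreplace", warnreplace)


def decodefilename(name, encoding=None, errors="warnreplace"):
    """Convert a bytes file name to str, only when text output is needed."""
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return name.decode(encoding, errors)


def encodefilename(name, encoding=None, errors="warnreplace"):
    """Convert a str file name back to bytes."""
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return name.encode(encoding, errors)


if __name__ == "__main__":
    # Passing bytes to os.listdir() makes it return bytes names;
    # decoding happens only at the point where names are printed.
    for raw in os.listdir(b"."):
        print(decodefilename(raw))
```

Note that a name decoded with a replacing handler may no longer
round-trip to the original bytes, so the bytes value is the one to keep
for opening the file; the decoded form is for display only.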