On Wed, Feb 23, 2005 at 01:03:56AM -0600, Kenneth Pronovici wrote:
[snip]
> Today, I accidentally ran across a directory containing three "normal"
> files (with ASCII filenames) and one file with a two-character unicode
> filename. My code, which was doing something like this:
>
>    for entry in os.listdir(path):     # path is <type 'unicode'>
>       entrypath = os.path.join(path, entry)
>
> suddenly started blowing up with the dreaded unicode error:
>
>    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in
>    position 1: ordinal not in range(128)

Sorry to reply to my own note, but after sleeping on it, I think I've
come up with a reasonable solution.

Now that I've dug further and my eyes are less bleary, everything seems
to work as long as I pass only plain (non-unicode) strings to the
filesystem functions. I think I can solve my problem by converting any
unicode strings that come from configuration into UTF-8 byte strings
using encode(). With this change, all of my existing regression tests
still pass, and my code makes it past the unusual directory.

>    [u'README.strange-name', '\xe2\x99\xaa\xe2\x99\xac',
>     u'utflist.long.gz', u'utflist.cp437.gz', u'utflist.short.gz']
>
> Note that in this second result, element [1] is not a unicode string
> while the other three elements are.

I'm still confused as to why this happens, but since I can work around
it, I guess I don't care so much.

Thanks,

KEN

-- 
Kenneth J. Pronovici <[EMAIL PROTECTED]>
-- 
http://mail.python.org/mailman/listinfo/python-list
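
For reference, here is a minimal sketch of the encode() workaround
described above. It is not the code from the original program. The
helper names to_fs_bytes and list_entry_paths are hypothetical, and
UTF-8 is assumed to be the right encoding for the filesystem. The idea
is to encode the path once, so that os.listdir() and os.path.join()
only ever see byte strings and no implicit ASCII decode is attempted.
As background, the Python 2 documentation notes that os.listdir()
called with a unicode path returns any undecodable filenames as plain
strings, which would be consistent with the mixed list quoted above.

```python
import os


def to_fs_bytes(path, encoding="utf-8"):
    """Return path as a byte string, encoding it if it is a text string.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(path, bytes):
        return path
    return path.encode(encoding)


def list_entry_paths(path):
    """List the entries of a directory as full byte-string paths.

    The directory path is encoded first, so os.listdir() returns byte
    strings and os.path.join() never mixes text and byte strings.
    """
    bpath = to_fs_bytes(path)
    return [os.path.join(bpath, entry) for entry in os.listdir(bpath)]
```

With a helper along these lines, a path read from configuration as
unicode would be passed through to_fs_bytes() once, at the boundary,
and everything downstream would deal only in byte strings.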