jeffunit <j...@jeffunit.com> wrote:
>>That looks like a "surrogate escape" (See PEP 383)
>>http://www.python.org/dev/peps/pep-0383/. It indicates the wrong
>>encoding was used to decode the filename.
>
> That seems likely. How do I set the encoding to something correct to
> decode the filename?
>
> Clearly windows knows how to display it.
> I suspect since I compiled python with cygwin, that it is using a
> POSIX standard, rather than a windows specific standard. Of course
> ideally, I would like my code to work on linux as well as windows,
> as I back up all of my data to a linux machine with samba.
>

If you are running on a Linux system then the filenames are stored as
bytes, and the system does not record which encoding was used. In fact,
different files in the same directory could use different encodings.
That's why Python 3.1 uses surrogate escapes: undecodable bytes are
preserved, so you can at least work with the files even if you can't
display the filenames.

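
To make the surrogate escape behaviour concrete, here is a minimal sketch. The filename bytes are made up for illustration, and UTF-8 is only assumed as the decoding that was attempted; your system may be using a different one.

```python
# Hypothetical filename bytes: encoded as Latin-1, so not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with the PEP 383 error handler maps the undecodable byte 0xE9
# to the lone surrogate U+DCE9 instead of raising UnicodeDecodeError.
name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly,
# so the name can still be passed back to the operating system.
round_trip = name.encode("utf-8", "surrogateescape")

# For display only: replace anything ASCII cannot represent with '?'.
printable = name.encode("ascii", "replace").decode("ascii")

print(repr(name), round_trip, printable)
```

The escaped name is fine for opening, copying or renaming the file. Only printing it directly is likely to fail, which is why the last step builds a separate display string.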
If you are running on Windows and using the native Python to access an
NTFS-formatted partition then there shouldn't be a problem: the
filenames are stored as Unicode and Python uses the Unicode APIs. Of
course, you may still not be able to display the filenames if they
contain characters not available in your output codepage.

If you use Cygwin, a quick search on Google turned up some old
discussions implying that it uses the 8-bit APIs, which convert
characters using the current codepage and replace any character they
cannot handle with '?', but I have no idea if that still applies.
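
As a rough analogy for the codepage conversion described above, Python's own encoder shows the same lossy effect. This is only a sketch: `cp1252` stands in for "the current codepage", the filename and the helper `fits_codepage` are made up for illustration, and none of it has been checked against what Cygwin actually does.

```python
def fits_codepage(name, codepage):
    """Return True if every character of name exists in the codepage."""
    try:
        name.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical filename with a character outside Western European codepages.
name = "report_\u65e5.txt"

# Lossy conversion: the unrepresentable character becomes '?'.
lossy = name.encode("cp1252", errors="replace")

print(lossy, fits_codepage(name, "cp1252"))
```

Once a name has been through a conversion like this, the original character cannot be recovered, so the lossy form no longer identifies the file on disk.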