Re: Detecting filename-encoding (on WinXP)?
Actually, the directory name comes in as a URL, and so far I have had no problems simply creating a unicode string from it, passing that to os.walk(), and getting proper unicode filenames back. I can then encode them into UTF-8 and pass them to the database layer, and it all works.

cheers,

--Tim
-- 
http://mail.python.org/mailman/listinfo/python-list
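A minimal sketch of that workflow, written for Python 3 (where every str is already a unicode string) rather than the Python 2 of this thread. The helper names url_part_to_text and walk_utf8 are made up for illustration, and the sketch assumes the URL percent-encodes UTF-8 bytes, which the message does not state.

```python
import os
import tempfile
from urllib.parse import unquote


def url_part_to_text(url_part):
    # Assumption: the URL percent-encodes UTF-8 bytes.
    # errors="strict" makes a wrong assumption fail loudly.
    return unquote(url_part, encoding="utf-8", errors="strict")


def walk_utf8(root):
    """Yield every file path under root as UTF-8 bytes for the database layer."""
    # A str root makes os.walk() return str (unicode) names.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name).encode("utf-8")


# Demo in a throwaway directory with a Cyrillic name.
with tempfile.TemporaryDirectory() as base:
    dirname = url_part_to_text("%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82")
    target = os.path.join(base, dirname)
    os.mkdir(target)
    open(os.path.join(target, "тест.txt"), "w").close()
    stored = list(walk_utf8(target))
```

Here stored holds one bytes value per file, ready to hand to a database layer that expects UTF-8.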
Re: Detecting filename-encoding (on WinXP)?
On 2 Feb 2006 08:03:14 -0800, rumours say that "Tim N. van der Leeuw" <[EMAIL PROTECTED]> might have written:

>So now what I need to know is, how do I find out in what encoding a
>particular filename is? Is there a portable way for doing this?

You said the filename comes as data, and not as contents of os.listdir(), right? You can only know (almost for certain) which encodings the filename is *not* in, by looping over candidate encodings and marking those where .decode fails.

If it were textual data, you could be more successful in guessing by testing characters in pairs and their frequencies. (By the way, it's been a long time since I requested example texts in various encodings for my encoding-guessing app, but I was sent only one.)

-- 
TZOTZIOY, I speak England very best.
"Dear Paul, please stop spamming us."
The Corinthians
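The elimination approach described in this reply could look roughly like the sketch below. The function name split_by_decodability and the candidate list are illustrative choices, not something from the thread. Note that a successful decode proves nothing: latin-1 accepts any byte sequence, which is why only the failures are informative.

```python
def split_by_decodability(raw, candidates):
    """Return (possible, ruled_out) lists of encoding names for raw bytes."""
    possible, ruled_out = [], []
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            ruled_out.append(enc)   # definitely not this encoding
        else:
            possible.append(enc)    # still a candidate, nothing more
    return possible, ruled_out


# A Cyrillic name as raw bytes in the Windows-1251 code page.
raw_name = "привет".encode("cp1251")
possible, ruled_out = split_by_decodability(
    raw_name, ["ascii", "utf-8", "cp1251", "latin-1"]
)
# possible  -> ['cp1251', 'latin-1']
# ruled_out -> ['ascii', 'utf-8']
```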
Re: Detecting filename-encoding (on WinXP)?
Hi Magnus,

I get the filename from a URL, which is probably not any kind of unicode string but just a plain ASCII string. It should be possible to convert this to a unicode string; I'll try it right away to see if this works.

Thanks!

--Tim
Re: Detecting filename-encoding (on WinXP)?
Tim N. van der Leeuw wrote:
> Hi,
>
> I have a need to store directory and filenames in a database. For the
> database I chose to use UTF-8 encoding; but the actual encoding used is
> probably immaterial: whichever coding I take, I'll run into this issue
> eventually.
>
> At first my code worked until I ran into a directory full of Cyrillic
> characters and my program blew up.

How did you find the files? Did you pass a Unicode path as argument to os.listdir()? See http://www.python.org/peps/pep-0277.html

> So now what I need to know is, how do I find out in what encoding a
> particular filename is? Is there a portable way for doing this? And if
> not, then what is the non-portable way for doing this on Windows?
> (WinXP)
> (If there's only a non-portable way then I'll worry about porting it
> later, if and when this program will ever have a need to run on a
> Unix-like environment)
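To illustrate the os.listdir() question in this reply: the type of the path argument decides the type of the names that come back. The sketch below uses Python 3, where str plays the role of the Python 2 unicode type discussed in the thread; the file name is just an example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    open(os.path.join(base, "тест.txt"), "w").close()

    # str (unicode) path in, str names out: no encoding guesswork needed.
    text_names = os.listdir(base)

    # bytes path in, bytes names out: the caller must know the encoding.
    byte_names = os.listdir(os.fsencode(base))

    # os.fsdecode() applies the filesystem encoding Python itself uses.
    decoded = [os.fsdecode(name) for name in byte_names]
```

Passing the unicode form of the path avoids having to detect a filename encoding at all, which is the point of the PEP 277 reference.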
Detecting filename-encoding (on WinXP)?
Hi,

I have a need to store directory and filenames in a database. For the database I chose to use UTF-8 encoding; but the actual encoding used is probably immaterial: whichever coding I take, I'll run into this issue eventually.

At first my code worked until I ran into a directory full of Cyrillic characters and my program blew up.

So now what I need to know is, how do I find out in what encoding a particular filename is? Is there a portable way for doing this? And if not, then what is the non-portable way for doing this on Windows? (WinXP)

(If there's only a non-portable way then I'll worry about porting it later, if and when this program will ever have a need to run on a Unix-like environment)

Many thanks in advance,

--Tim