Re: Detecting filename-encoding (on WinXP)?

2006-02-10 Thread Tim N. van der Leeuw
Actually, the directory name comes in as a URL, and so far I have had no
problems simply creating a unicode string from it. I can pass that string
to os.walk() and get proper unicode filenames back from it.
Then I can encode them as UTF-8 and pass them to the database layer,
and it all works.
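
A minimal sketch of that approach, written for Python 3, where every `str` is already a Unicode string (the thread dates from Python 2, where a `unicode` object had to be passed explicitly). The helper name `utf8_paths` is hypothetical and not taken from the original program:

```python
import os


def utf8_paths(root):
    """Yield (text_path, utf8_bytes) for every file below root.

    Assumes root is a text (Unicode) path, so os.walk() hands back
    text names as well.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Encode only at the boundary to the database layer.
            yield full, full.encode("utf-8")
```

Names that cannot be decoded on a POSIX system come back with surrogate escapes, and encoding those as UTF-8 raises UnicodeEncodeError, so a real program would have to decide how to handle them.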

cheers,

--Tim

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Detecting filename-encoding (on WinXP)?

2006-02-10 Thread Christos Georgiou
On 2 Feb 2006 08:03:14 -0800, rumours say that "Tim N. van der Leeuw"
<[EMAIL PROTECTED]> might have written:

>So now what I need to know is, how do I find out in what encoding a
>particular filename is? Is there a portable way for doing this?

You said the filename comes as data, and not as contents of os.listdir(),
right?

You can only know (almost for certain) which encodings the filename is
*not* in (by looping over encodings and marking those where .decode fails).
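
A rough sketch of that elimination loop, in Python 3 syntax. The function name `plausible_encodings` and the default candidate list are assumptions chosen for illustration; any codec names Python knows could be substituted:

```python
def plausible_encodings(raw, candidates=("ascii", "utf-8", "cp1251",
                                         "cp1252", "latin-1")):
    """Return the candidate encodings under which raw decodes without error.

    A survivor is only "not ruled out"; it is not proven to be correct.
    """
    survivors = []
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding is definitely wrong for these bytes
        survivors.append(enc)
    return survivors
```

Note that single-byte codecs such as latin-1 accept every byte sequence, so they can never be eliminated this way; the loop narrows the field but does not pick a winner.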

If it were textual data, you could be more successful in guessing, by
testing characters in pairs and their frequencies. (Btw, it's been a long
time since I requested example texts in various encodings for my
encoding-guessing app, but I was sent only one.)
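
A much simplified sketch of the character-pair idea, in Python 3 syntax. It is not the encoding-guessing app mentioned above: instead of real frequency tables it only checks which adjacent character pairs of each candidate decoding also occur in a reference text that the caller supplies. The names `bigrams` and `rank_encodings` are hypothetical:

```python
def bigrams(text):
    """Return the set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def rank_encodings(raw, reference_text, candidates):
    """Rank candidate encodings by how familiar the decoded text looks.

    The score is the fraction of character pairs in the decoded text
    that also appear in reference_text (a sample in the expected language).
    """
    known = bigrams(reference_text)
    scores = []
    for enc in candidates:
        try:
            decoded = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # ruled out entirely
        pairs = [decoded[i:i + 2] for i in range(len(decoded) - 1)]
        if not pairs:
            continue  # too short to say anything
        hits = sum(1 for pair in pairs if pair in known)
        scores.append((hits / len(pairs), enc))
    return sorted(scores, reverse=True)
```

With a short filename there are very few pairs to test, which is why this works better on longer textual data than on filenames.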
-- 
TZOTZIOY, I speak England very best.
"Dear Paul,
please stop spamming us."
The Corinthians


Re: Detecting filename-encoding (on WinXP)?

2006-02-02 Thread Tim N. van der Leeuw
Hi Magnus,

I get the filename from a URL, which is probably not any kind of
unicode string but just a plain ASCII string. It should be possible to
cast this to a unicode string -- I'll try it right away to see if this
works.

Thanks!

--Tim



Re: Detecting filename-encoding (on WinXP)?

2006-02-02 Thread Magnus Lycka
Tim N. van der Leeuw wrote:
> Hi,
> 
> I have a need to store directory and filenames in a database. For the
> database I chose to use UTF-8 encoding; but the actual encoding used is
> probably immaterial: whichever coding I take, I'll run into this issue
> eventually.
> 
> At first my code worked until I ran into a directory full of Cyrillic
> characters and my program blew up.

How did you find the files? Did you pass a Unicode path as argument
to os.listdir()? See http://www.python.org/peps/pep-0277.html
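
For illustration, a small sketch of the behaviour PEP 277 describes: the type of the path argument decides the type of the names that come back. It is written for Python 3, where `str` is Unicode text and `bytes` is the raw form (in Python 2 the corresponding types were `unicode` and `str`). The function name `list_names` is hypothetical:

```python
import os


def list_names(directory):
    """Return (text_names, byte_names) for the entries in directory."""
    # Text path in, text (Unicode) names out.
    text_names = sorted(os.listdir(directory))
    # Bytes path in, bytes names out, in the filesystem encoding.
    byte_names = sorted(os.listdir(os.fsencode(directory)))
    return text_names, byte_names
```

Asking for text names means there is no encoding left to detect; the question of which encoding to use only comes up again when the names are written out, for example to the database.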

> So now what I need to know is, how do I find out in what encoding a
> particular filename is? Is there a portable way for doing this? And if
> not, then what is the non-portable way for doing this on Windows?
> (WinXP)
> (If there's only a non-portable way then I'll worry about porting it
> later, if and when this program will ever have a need to run on a
> Unix-like environment)



Detecting filename-encoding (on WinXP)?

2006-02-02 Thread Tim N. van der Leeuw
Hi,

I have a need to store directory and filenames in a database. For the
database I chose to use UTF-8 encoding; but the actual encoding used is
probably immaterial: whichever coding I take, I'll run into this issue
eventually.

At first my code worked until I ran into a directory full of Cyrillic
characters and my program blew up.

So now what I need to know is, how do I find out in what encoding a
particular filename is? Is there a portable way for doing this? And if
not, then what is the non-portable way for doing this on Windows?
(WinXP)
(If there's only a non-portable way then I'll worry about porting it
later, if and when this program will ever have a need to run on a
Unix-like environment)


Many thanks in advance,

--Tim
