Re: File names, character sets and Unicode

2008-12-13 Thread Дамјан Георгиевски
> In a nutshell, this is likely to cause pain until all file systems are
> standardized on a particular encoding of Unicode. Probably only about
> another fifteen years to go ...
Well, most Linux distros are defaulting to a UTF-8 locale now, the exception being Gentoo & similar that expect the use

Re: File names, character sets and Unicode

2008-12-12 Thread Steve Holden
Michal Ludvig wrote:
> Hi all,
>
> is there any way to determine what's the charset of filenames returned
> by os.walk()?
>
> The trouble is, if I pass a str argument to os.walk() I get the
> filenames as byte-strings. Possibly UTF-8 encoded Unicode, who knows.
>
> OTOH if I pass a unicode argument to os.walk(), all th

Re: File names, character sets and Unicode

2008-12-12 Thread Marc 'BlackJack' Rintsch
On Fri, 12 Dec 2008 23:32:27 +1300, Michal Ludvig wrote:
> is there any way to determine what's the charset of filenames returned
> by os.walk()?
No. Especially on *nix file systems, file names are just a string of bytes, not characters. It is possible to have file names in different encond
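
The point that a file name is only a byte string with no recorded encoding can be illustrated with a minimal sketch, assuming Python 3. The helper name `decode_filename` and its `candidates` parameter are hypothetical and do not come from the thread. It tries each candidate encoding in order and, if none fits, falls back to the `surrogateescape` error handler so the original bytes stay recoverable.

```python
def decode_filename(raw, candidates=("utf-8",)):
    """Return (text, encoding) for the first candidate that decodes raw.

    Hypothetical helper: the file system does not record an encoding,
    so a successful decode is only a guess, not proof.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # No candidate fit: map undecodable bytes to lone surrogates so that
    # encoding with the same handler gives back the original bytes.
    return raw.decode("utf-8", errors="surrogateescape"), None


# A Latin-1 encoded name is not valid UTF-8, so the fallback is used.
raw_name = b"caf\xe9.txt"
text, guessed = decode_filename(raw_name)
round_trip = text.encode("utf-8", errors="surrogateescape")
```

Because a single-byte encoding such as Latin-1 accepts every byte sequence, adding it to `candidates` always "succeeds", which is exactly why the encoding cannot be determined reliably from the bytes alone.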

File names, character sets and Unicode

2008-12-12 Thread Michal Ludvig
Hi all, is there any way to determine what's the charset of filenames returned by os.walk()? The trouble is, if I pass a str argument to os.walk() I get the filenames as byte-strings. Possibly UTF-8 encoded Unicode, who knows. OTOH if I pass a unicode argument to os.walk(), all the filenames I get in the loop are alrea
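
The behaviour described in the question, where the type of the argument decides the type of the returned names, can be checked with a short sketch. It assumes Python 3, where the two types are `bytes` and `str`; the helper `walk_names` and the file name `example.txt` are made up for illustration.

```python
import os
import tempfile


def walk_names(top):
    """Collect the file names that os.walk() yields under top."""
    names = []
    for _dirpath, _dirnames, filenames in os.walk(top):
        names.extend(filenames)
    return names


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file to walk over.
    open(os.path.join(tmp, "example.txt"), "w").close()

    text_names = walk_names(tmp)               # str path in, str names out
    byte_names = walk_names(os.fsencode(tmp))  # bytes path in, bytes names out
```

When a `str` path is passed, Python decodes the names using the file system encoding reported by `sys.getfilesystemencoding()`, which is a setting of the running process and not a property stored with each file.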