On Fri, 2006-12-01 at 16:36 +0100, Duncan Webb wrote:
> First, when a string is a Unicode string does this mean that every
> character is 2 or 4 bytes wide?
Not necessarily; it depends on the encoding. Neither latin1 nor UTF-8 stores every character in 2 or 4 bytes: latin1 uses one byte per character, and UTF-8 uses between one and four.

> Second, file names from a fat system seem to be in latin1 but on the
> ext2/3 are in utf8. How can they be processed in a safe way without
> causing UnicodeErrors?

Firstly, the encoding is not always utf8 on ext3. You can get the filesystem encoding via sys.getfilesystemencoding(), but that doesn't guarantee that a given filename isn't encoded as latin1 anyway.

Consequently, you must never use unicode for storing filenames; always keep them as str objects. When you need to display a filename, convert it to unicode at that point.

kaa.strutils.str_to_unicode attempts to do the right thing when you don't know whether a string is encoded as latin1 or utf8. (kaa.strutils is in kaa.base; you can just copy that function into the 1.x tree if you need it.)

Jason.

_______________________________________________
Freevo-devel mailing list
Freevo-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-devel
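As a rough illustration of both answers, here is a minimal sketch written for Python 3, where the str objects discussed in this thread correspond to bytes and unicode corresponds to str. The helper name filename_for_display is hypothetical; it is not the actual implementation of kaa.strutils.str_to_unicode, only the same general idea (try utf8 first, fall back to latin1).

```python
import sys

# Byte width of one non-ASCII character under different encodings.
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
widths = {
    enc: len(text.encode(enc))
    for enc in ("latin-1", "utf-8", "utf-16-le", "utf-32-le")
}

# What Python believes the filesystem encoding is. Individual filenames
# on disk may still use a different encoding.
fs_encoding = sys.getfilesystemencoding()


def filename_for_display(raw: bytes) -> str:
    """Decode a raw filename for display only (hypothetical helper).

    The raw bytes remain the authoritative name; use them, not the
    decoded text, when opening or renaming the file.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte value to a code point, so this
        # fallback cannot raise.
        return raw.decode("latin-1")
```

For example, both b"caf\xc3\xa9" (utf8) and b"caf\xe9" (latin1) decode to the same display text, while the two byte strings stay distinct as filenames. The fallback is a guess: a latin1 name that happens to be valid utf8 would be decoded as utf8.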