On Fri, 2006-12-01 at 16:36 +0100, Duncan Webb wrote:
> First, when a string is a Unicode string does this mean that every
> character is 2 or 4 bytes wide?

Not necessarily.  It depends on the encoding.  latin1 uses one byte per
character, and UTF8 uses between one and four.
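
As a rough illustration of the point above, here is a small sketch.  It is
written for Python 3, where the old unicode type is called str and the old
str type is called bytes; the sample text is made up.

```python
# The encoded size depends on the encoding, not on the text being "Unicode".
text = "caf\u00e9"  # 4 characters, the last one is non-ASCII

latin1_bytes = text.encode("latin-1")    # always 1 byte per character
utf8_bytes = text.encode("utf-8")        # 1 to 4 bytes per character
utf16_bytes = text.encode("utf-16-le")   # 2 bytes per character for this text

print(len(text), len(latin1_bytes), len(utf8_bytes), len(utf16_bytes))
# prints: 4 4 5 8
```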

> Second, file names from a fat system seem to be in latin1 but on the
> ext2/3 are in utf8. How can they be processed in a safe way without
> causing UnicodeErrors?

Firstly, the encoding is not always utf8 on ext3.  The filesystem
encoding can be obtained via sys.getfilesystemencoding(), but that
doesn't guarantee that a given filename is actually in that encoding; it
may be encoded latin1 anyway.  Consequently, you should never use unicode
objects for storing filenames, and always keep them as str objects.
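
A minimal sketch of that approach, assuming Python 3 (where the byte
string type is bytes rather than str).  The helper name list_raw_names
and the file name example.txt are made up for illustration.

```python
import os
import sys
import tempfile

# What Python believes the filesystem encoding is; individual names on
# disk may still be in some other encoding.
print(sys.getfilesystemencoding())

def list_raw_names(directory):
    """Return directory entries as undecoded byte strings."""
    # Passing a bytes path makes os.listdir return bytes names.
    return os.listdir(os.fsencode(directory))

# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()
    raw_names = list_raw_names(tmp)

print(raw_names)
```

Because the names are never decoded, listing and reopening files cannot
raise a UnicodeError, whatever encoding the names happen to be in.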

To display a filename, you can convert it to unicode at that point.
kaa.strutils.str_to_unicode attempts to do the right thing when you
don't know whether a string is encoded latin1 or utf8.  (kaa.strutils is
in kaa.base; you can just copy that function into the 1.x tree if you
need it.)
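
The general idea can be sketched as below.  This is not the actual
kaa.strutils.str_to_unicode code, just one possible way to do it under
the assumption that the name is either utf8 or latin1; the function name
to_display_text is hypothetical, and the sketch targets Python 3.

```python
def to_display_text(raw, encodings=("utf-8",)):
    """Decode a raw filename for display only (hypothetical helper)."""
    if isinstance(raw, str):
        return raw  # already text, nothing to do
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this cannot fail.
    return raw.decode("latin-1")

utf8_name = to_display_text(b"caf\xc3\xa9.avi")   # valid utf8
latin1_name = to_display_text(b"caf\xe9.avi")     # not valid utf8
print(utf8_name == latin1_name)
```

utf8 is tried first because latin1 text containing non-ASCII characters
is rarely valid utf8, whereas a latin1 decode accepts any byte sequence
and so only makes sense as the last resort.  The result is for display;
the original byte string should still be the one used to open the file.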

Jason.


_______________________________________________
Freevo-devel mailing list
Freevo-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-devel
