Hi,
I took a look at the unicode handling in freevo, and attached is a small patch
to fix a couple of things:
1. urllib.quote() only handles byte strings, not unicode strings, so the
Unicode() conversion before the call is removed.
2. Add sitecustomize.py to set the freevo default encoding to 'utf-8', so that
each "".encode() call does not need to specify it. Note that the encoding
should not be hard-coded all over the place; currently there are things like:
search_string = '%s %s' % (artist.encode('latin-1'), album.encode('latin-1'))
in many places. The String() and Unicode() helper functions should usually be
used instead when a conversion is necessary.
3. I added another fallback to the Unicode() helper function: it now tries
'iso-8859-15' if the conversion to unicode fails with the default (utf-8). I
did this to handle filenames. The problem with filesystems is that they are
usually not unicode-aware, which means the user's locale defines how filenames
are encoded. So if my locale is iso-8859-15, a name like "tämä" will have
different bytes on disk than if my locale is utf-8. This probably happens
most often when moving files between machines with different locales, but
you can simulate the effect with something like:
os.mkdir("Andr\xe9".decode("latin-1").encode("latin-1"))
os.mkdir("Andr\xe9".decode("latin-1").encode("utf-8"))
That will give you two directories "André", but with different encodings.
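The same effect can be seen without touching the filesystem. Below is a minimal sketch, written for Python 3 (where the raw string type is bytes) rather than the Python 2 that freevo uses. The helper name decode_with_fallback is hypothetical; it only mirrors the idea behind the extra fallback in Unicode(), not the actual freevo code.

```python
# Sketch only: one name, two different byte sequences.
name = "Andr\u00e9"  # "André"

latin1_bytes = name.encode("latin-1")  # b'Andr\xe9'      (5 bytes)
utf8_bytes = name.encode("utf-8")      # b'Andr\xc3\xa9'  (6 bytes)


def decode_with_fallback(raw, encodings=("utf-8", "iso-8859-15")):
    """Hypothetical helper: try each encoding in turn, return the first success."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Not reachable with the defaults, because iso-8859-15 accepts every byte.
    raise ValueError("could not decode %r with any of %r" % (raw, encodings))


# Both byte sequences come back as the same text.
from_utf8 = decode_with_fallback(utf8_bytes)
from_latin1 = decode_with_fallback(latin1_bytes)
```

The order matters: utf-8 is strict and rejects most latin-encoded names, so it is tried first, while iso-8859-15 accepts any byte sequence and therefore only makes sense as the last resort.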
Dealing with unicode and different encodings can sometimes be confusing.
Personally I find it helpful to think of it as follows:
There is usually a pair: a unicode string and a raw string. The unicode string
knows which characters it contains, so you can think of it as carrying its
encoding along as metadata. The raw python string is just a bytestream. The
convention is that the bytestream contains ascii, but it can contain anything.
So unicode("abc") will take the bytestream "abc" and turn it to a unicode
string, and all is well as the default encoding is ascii.
Now unicode("äläpäs") will fail, unless you have changed the default
encoding. It is equivalent to "äläpäs".decode(). But the string is not pure
ascii, and thus it bails out with something like "UnicodeDecodeError: 'ascii'
codec can't decode byte ..."
So if you have a raw python string containing anything more exotic than ascii,
and you want to convert it to unicode, you must explicitly state the encoding
of the string. You can also change the default encoding from ascii to
something else, but only in site.py or sitecustomize.py, because site.py
removes sys.setdefaultencoding() once startup has finished.
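To make the failure and the fix concrete, here is a small sketch. It assumes Python 3, where the byte string is a bytes object and the codec has to be named on decode; in Python 2 the same error appears implicitly through the ascii default.

```python
# "äläpäs" as latin-1 bytes, written with escapes so the source file
# encoding does not matter.
raw = b"\xe4l\xe4p\xe4s"

# Pure ascii bytes decode without trouble.
plain = b"abc".decode("ascii")

# Non-ascii bytes are rejected by the ascii codec.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc  # exc.encoding names the codec, exc.start the offending offset

# Stating the real encoding explicitly makes the conversion work.
text = raw.decode("latin-1")
```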
Another curve ball is the user locale. What you type in the terminal can look
the same to you, but without checking it is impossible to say whether the
representation on disk will be the same. For instance, Mandriva 2007 seems
to default to utf-8 encoding, so filenames with accents get different bytes
than they would under latin-1. Furthermore, if you write something using utf-8
in kwrite and give the resulting file to a friend who is using latin-1, the
contents will not render properly: he will not see your accents until he
changes his encoding.
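What the friend would see can be reproduced in a few lines. This is only an illustrative sketch in Python 3; the variable names are made up for the example.

```python
text = "t\u00e4m\u00e4"  # "tämä"

# What an editor configured for utf-8 writes to disk.
on_disk = text.encode("utf-8")

# What a reader assuming latin-1 gets: each accented letter turns into
# two unrelated characters ("tÃ¤mÃ¤").
seen_by_friend = on_disk.decode("latin-1")

# Switching the reader to the right encoding recovers the text.
recovered = on_disk.decode("utf-8")
```

Note that nothing fails loudly here: latin-1 accepts any byte, so the wrong guess produces garbage silently instead of raising an error.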
Hope this helps,
Harri
Index: src/www/htdocs/library.rpy
===================================================================
--- src/www/htdocs/library.rpy (revision 8806)
+++ src/www/htdocs/library.rpy (working copy)
@@ -344,7 +344,6 @@
# get me the directories to output
directorylist = util.getdirnames(String(action_dir))
for mydir in directorylist:
- mydir = Unicode(mydir)
fv.tableRowOpen('class="chanrow"')
mydispdir = os.path.basename(mydir)
mydirlink = '<a href="'+ action_script +'?media='+action_mediatype+'&dir='+urllib.quote(mydir)+'">'+mydispdir+'</a>'
Index: src/sitecustomize.py
===================================================================
--- src/sitecustomize.py (revision 0)
+++ src/sitecustomize.py (revision 0)
@@ -0,0 +1,12 @@
+# -*- coding: iso-8859-1 -*-
+# -----------------------------------------------------------------------
+# sitecustomize.py - Automatically imported if present
+# Set the default encoding for freevo to be more
+# generic than the default ascii.
+# See http://docs.python.org/lib/module-site.html
+# -----------------------------------------------------------------------
+# $Id$
+
+import sys
+
+sys.setdefaultencoding('utf-8')
Index: src/util/__init__.py
===================================================================
--- src/util/__init__.py (revision 8806)
+++ src/util/__init__.py (working copy)
@@ -54,9 +54,12 @@
try:
return unicode(string, config.LOCALE)
except Exception, e:
- print 'Error: Could not convert %s to unicode' % repr(string)
- print 'tried encoding %s and %s' % (encoding, config.LOCALE)
- print e
+ try:
+ return unicode(string, "iso-8859-15")
+ except Exception, e:
+ print 'Error: Could not convert %s to unicode' % repr(string)
+ print 'tried encoding %s and %s' % (encoding, config.LOCALE)
+ print e
elif string.__class__ != unicode:
return unicode(str(string), config.LOCALE)
_______________________________________________
Freevo-users mailing list
Freevo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-users