kyrian (List) wrote:
>
>The gist of the problem seems to be that you need to treat the strings
>as utf-8 or iso-8859-1 encoded 'objects' rather than standard ASCII
>string types within the code, and I don't know for sure how to do that.
And you have to know which, because there are iso-8859-1 encoded
characters which aren't valid utf-8 sequences, and there are utf-8
encoded characters which get garbled if decoded as iso-8859-1.
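To make the two failure modes concrete, here is a minimal sketch. It assumes Python 3, where bytes.decode(encoding) does the job of unicode(value, encoding); the sample word and the variable names are made up for illustration.

```python
# Assumption: Python 3. bytes.decode(enc) stands in for unicode(value, enc).
text = "caf\u00e9"                          # "cafe" with e-acute
latin1_bytes = text.encode("iso-8859-1")    # b'caf\xe9'
utf8_bytes = text.encode("utf-8")           # b'caf\xc3\xa9'

# Failure mode 1: iso-8859-1 bytes that are not valid utf-8.
# 0xE9 opens a multi-byte utf-8 sequence that is never completed.
try:
    latin1_bytes.decode("utf-8")
    latin1_as_utf8_ok = True
except UnicodeDecodeError:
    latin1_as_utf8_ok = False

# Failure mode 2: utf-8 bytes decoded as iso-8859-1.
# Every byte maps to some character, so nothing is raised;
# the e-acute silently becomes two characters (A-tilde, copyright sign).
garbled = utf8_bytes.decode("iso-8859-1")

print(latin1_as_utf8_ok)
print(repr(garbled))
```

The first case at least fails loudly. The second produces wrong text with no error, which is why the encoding has to be known rather than guessed.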
Thus, code like
    try:
        unicode(value, "ascii")
    except UnicodeError:
        value = unicode(value, "utf-8")
    else:
        # value was valid ASCII data
        pass
(which I think is no different from simply

    value = unicode(value, "utf-8")

since if value is ascii to begin with, calling it utf-8 is OK)
doesn't work if value is actually iso-8859-1 encoded and contains bytes
which aren't valid utf-8, or which are valid utf-8 but decode to
different characters than they do as iso-8859-1.
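A rough sketch of why this breaks, again assuming Python 3 (bytes.decode in place of unicode()). Both function names are hypothetical, and decode_ascii_then_utf8 is only an approximation of the snippet above: it returns decoded text in both branches, whereas the original leaves pure-ASCII input as it was.

```python
def decode_ascii_then_utf8(value):
    """Approximate Python 3 rendering of the try/except pattern above."""
    try:
        return value.decode("ascii")
    except UnicodeDecodeError:
        return value.decode("utf-8")


def decode_known(value, charset):
    """Hypothetical helper: decode with a charset the caller already knows."""
    return value.decode(charset)


# Pure ASCII input: trying ascii first gives the same text as utf-8 alone.
ascii_bytes = b"hello"
same_for_ascii = decode_ascii_then_utf8(ascii_bytes) == ascii_bytes.decode("utf-8")

# iso-8859-1 bytes for A-tilde + copyright sign are also a valid utf-8
# encoding of e-acute, so the guess succeeds but yields the wrong text.
ambiguous = "\u00c3\u00a9".encode("iso-8859-1")    # b'\xc3\xa9'
wrong = decode_ascii_then_utf8(ambiguous)           # one character
right = decode_known(ambiguous, "iso-8859-1")       # two characters

# iso-8859-1 bytes for e-acute alone are not valid utf-8 at all.
invalid = "\u00e9".encode("iso-8859-1")             # b'\xe9'
try:
    decode_ascii_then_utf8(invalid)
    raised = False
except UnicodeDecodeError:
    raised = True
```

The byte values alone can't tell you which reading is right in the ambiguous case; only the charset supplied by the caller settles it.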
--
Mark Sapiro <[email protected]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan