Piet van Oostrum wrote:
Dave Angel <da...@dejaviewphoto.com> (DA) wrote:

DA> Works for me:

DA> rrr = downcode(u"Žabovitá zmiešaná kaša")
DA> print repr(rrr)
DA> print rrr

DA> prints out:

DA> u'Zabovita zmiesana kasa'
DA> Zabovita zmiesana kasa

DA> I did have to add an encoding declaration as line 2 of the file:

DA> #-*- coding: latin-1 -*-

DA> and I had to convince my editor (Komodo) to save the file in utf-8.

*Seems to work*.
If you save in utf-8 the coding declaration also has to be utf-8.
Besides, many of these characters won't be representable in latin-1.
The reason it worked is that these characters were translated into
sequences of two or more bytes, and replace did work with these. But it's
dangerous, as they are then no longer the unicode characters they were
intended to be.

Thanks for the correction. What I meant by "works for me" is that the single example in the docstring translated okay. But I do have a lot to learn about using Unicode in sources, and I want to learn.
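For anyone following along, here is a small sketch of the mismatch as I now understand it. It is only an illustration, not the downcode code from the thread: the string is written with escapes so the result does not depend on how this message itself gets saved, and the variable names are made up for the example. It encodes the text as UTF-8 (what the editor wrote to disk) and then decodes those bytes as latin-1 (what the coding declaration told Python to do).

```python
# -*- coding: utf-8 -*-
# Illustration only: names here are hypothetical, not from the original code.

# "Zabovita" with its accents, spelled with escapes to avoid any
# dependence on the encoding of this file.
text = u"\u017dabovit\u00e1"

# What an editor saving as UTF-8 writes to disk.
on_disk = text.encode("utf-8")

# What Python sees if the coding declaration says latin-1:
# every byte becomes its own character.
misread = on_disk.decode("latin-1")

print(len(text))      # 8 characters intended
print(len(misread))   # 10 characters after the mismatch
print(misread[:2] == u"\u00c5\u00bd")  # the Z-caron became two characters

# The original character cannot be stored in latin-1 at all.
try:
    text.encode("latin-1")
    representable = True
except UnicodeEncodeError:
    representable = False
print(representable)
```

If the replacement table sits in the same file, its keys would presumably be misread in exactly the same way, which would explain why the replace calls still matched even though the strings no longer held the intended characters.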

So tell me, how were we supposed to guess what encoding the original message used? I originally had the mailing list message (in Thunderbird email). When I copied (copy/paste) to Komodo IDE (text editor), it wouldn't let me save because the file type was ASCII. So I randomly chose latin-1 for file type, and it seemed to like it.

At that point I expected and got errors from Python because I had no coding declaration. I used latin-1, and still had problems, though I forget what they were. Only when I changed the file encoding type again, to utf-8, did the errors go away. I agree that they should agree, but I don't know how to reconcile the copy/paste boundary, the file type (without BOM, which is another variable), the coding declaration, and the stdout implicit ASCII encoding. I understand a bunch of it, but not enough to be able to safely walk through the choices.
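As a way of keeping the layers straight, here is a rough sketch that names the encoding explicitly at each boundary instead of relying on defaults. It is one possible arrangement under my own assumptions, not an authoritative recipe: the file name sample.txt and the variable names are invented, UTF-8 is simply the encoding I picked for the file, and the ASCII step at the end stands in for a stdout that can only handle ASCII.

```python
# -*- coding: utf-8 -*-
# Sketch only: file name and variable names are hypothetical.
import io
import os
import tempfile

# The text, spelled with escapes so it does not depend on this file's encoding.
text = u"\u017dabovit\u00e1 ka\u0161a"

# Boundary 1: writing to disk. State the encoding rather than trusting the editor.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Boundary 2: reading back. Use the same encoding that was used to write.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()

print(round_trip == text)  # True when both sides agree

# Boundary 3: output. For an ASCII-only stdout, encode explicitly and
# let unrepresentable characters become backslash escapes.
safe_for_ascii = round_trip.encode("ascii", "backslashreplace")
print(safe_for_ascii)
```

The copy/paste step is the one boundary this does not cover, since what arrives from the clipboard depends on the mail client and the editor.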

Is this all written up in one place, where an experienced programmer can make sense of it? I've nibbled at the edges (even wrote a UTF-8 encoder/decoder a dozen years ago).

DaveA
--
http://mail.python.org/mailman/listinfo/python-list
