On Nov 3, 2009, at 4:44 PM, Gabriel Genellina wrote:

On Tue, 03 Nov 2009 21:01:46 -0300, Kee Nethery <k...@kagi.com> wrote:

I've removed all the stuff in my code and tried to distill it down to just what is failing. Hopefully I have not removed something essential.

Sounds like I did remove something essential.


et.XML expects bytes as input, not unicode. You're decoding too early (decoding early is usually good, but not in this case, because et does the decoding for you). Either feed et.XML the bytes before decoding, or re-encode the received XML text as UTF-8 (since that is the declared encoding).
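
A minimal sketch of both options, assuming the response body is UTF-8 XML. The sample document and the variable names are made up for illustration; raw_bytes stands in for whatever getResponse1.read() returns.

```python
import xml.etree.ElementTree as et

# Hypothetical stand-in for the result of getResponse1.read(): raw UTF-8 bytes.
raw_bytes = u'<?xml version="1.0" encoding="UTF-8"?><name>Jos\u00e9</name>'.encode('utf-8')

# Option 1: hand the undecoded bytes straight to et.XML and let the parser
# decode them according to the encoding declared in the XML header.
root_from_bytes = et.XML(raw_bytes)

# Option 2: if the text has already been decoded, re-encode it to the
# declared encoding (UTF-8 here) before parsing.
decoded_text = raw_bytes.decode('utf-8')
root_from_reencoded = et.XML(decoded_text.encode('utf-8'))
```

Either way the parser sees bytes that match the declared encoding, so the non-ASCII characters should survive.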

Here is the code that hits the URL:
        getResponse1 = urllib2.urlopen(theUrl)
        getResponse2 = getResponse1.read()
        getResponse3 = unicode(getResponse2,'UTF-8')
        theResponseXml = et.XML(getResponse3)

So are you saying I want to do:
        getResponse1 = urllib2.urlopen(theUrl)
        getResponse4 = getResponse1.read()
        theResponseXml = et.XML(getResponse4)

The reason I am confused is that getResponse2 is classified as a "str" in the Komodo IDE. I want to make sure I don't lose the non-ASCII characters coming from the URL. If I use the second version of the code, does elementtree automatically convert the str into unicode? How do I deal with the XML as unicode when I put it into elementtree as a string?
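
For reference, a small sketch of what the second version should give back, using a made-up sample document in place of the real response. The "str" returned by read() is just the undecoded bytes, so nothing is lost by passing it along as-is; the parser hands text back already decoded (in Python 2, as a unicode object whenever it contains non-ASCII characters).

```python
import xml.etree.ElementTree as et

# Hypothetical sample; in the real code these bytes would come from read().
raw_bytes = u'<?xml version="1.0" encoding="UTF-8"?><name>Jos\u00e9</name>'.encode('utf-8')

root = et.XML(raw_bytes)

# Text pulled out of the tree is decoded text, not raw bytes.
name = root.text

# Serializing with an explicit encoding turns the tree back into bytes.
serialized = et.tostring(root, encoding='utf-8')
```

So the decoding happens inside the parser, and the encoding back to bytes happens only when the tree is serialized.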

Very confusing. Thanks for the help.

Kee
--
http://mail.python.org/mailman/listinfo/python-list
