Re: elementtree XML() unicode

Gabriel Genellina Tue, 03 Nov 2009 18:18:48 -0800

On Tue, 03 Nov 2009 21:01:46 -0300, Kee Nethery <[email protected]> wrote:

Having an issue with elementtree XML() in python 2.6.4.

This code works fine:

      from xml.etree import ElementTree as et
      getResponse = u'''<?xml version="1.0" encoding="UTF-8"?><customer><shipping><state>bobble</state><city>head</city><street>city</street></shipping></customer>'''
      theResponseXml = et.XML(getResponse)

This code fails when it reaches the et.XML() call:

      from xml.etree import ElementTree as et
      getResponse = u'''<?xml version="1.0" encoding="UTF-8"?><customer><shipping><state>\ue58d83\ue89189\ue79c8C</state><city>\ue69f8f\ue5b882</city><street>\ue9ab98\ue58d97\ue58fb03</street></shipping></customer>'''
      theResponseXml = et.XML(getResponse)

In my real code, I'm pulling the getResponse data from a web page that returns XML, and when I display it in the browser you can see the Japanese characters in the data. I've removed all the other stuff in my code and tried to distill it down to just what is failing. Hopefully I have not removed something essential.

Why is this not working, and what do I need to do to use ElementTree with unicode?

et expects bytes as input, not unicode. You're decoding too early (decoding early is good, but not in this case, because et does the work for you). Either feed et.XML with the bytes before decoding, or re-encode the received XML text in UTF-8 (since this is the declared encoding).
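
A minimal sketch of the second option, re-encoding before parsing. The sample text and the names response_text and response_bytes are made up for illustration; they stand in for whatever your code has already decoded from the web page:

```python
from xml.etree import ElementTree as et

# Hypothetical stand-in for the already-decoded response text.
# The \uXXXX escapes are ordinary Japanese characters used as sample data.
response_text = (
    u'<?xml version="1.0" encoding="UTF-8"?>'
    u'<customer><shipping>'
    u'<state>\u5343\u8449\u770c</state>'
    u'<city>\u67cf\u5e02</city>'
    u'</shipping></customer>'
)

# Re-encode to the encoding declared in the XML prolog (UTF-8),
# then hand the resulting bytes to et.XML so the parser does the decoding.
response_bytes = response_text.encode('utf-8')
root = et.XML(response_bytes)

# Element text comes back as text, not bytes.
state = root.find('shipping/state').text
city = root.find('shipping/city').text
```

If you still have the raw bytes as they came off the wire, passing those straight to et.XML avoids the decode/encode round trip altogether.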


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list
