Hi All,

I'm having trouble parsing XML returned by a web service. The XML data
contains characters above 128, so ET.fromstring throws an error. The
error is raised from Python's xmllib.py, where it detects a character
above 128. I replace the 'utf-8' encoding string in the returned XML
with 'ISO-8859-1' and then call .encode with 'ISO-8859-1' as the
argument. I still get the parsing error: illegal character.
What's interesting is that if I define a string constant and assign it
the value returned from the service request, it parses. That is, the
following parses fine:


TEST_EVNVELOPE2 = """<?xml version="1.0" encoding="ISO-8859-1"?>
                     <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
                     <soap:Body><GetResponse xmlns="http://tempuri.org/">
                     <GetResult>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;&lt;Response&gt;&lt;Entity Name="Accounts" Current="00300571BDF91DDCA7D1320EE5C78877"&gt;&lt;Field Name="Name" Value="Bad und WA¤rmetechnik FA_hrwirt GmbH"/&gt;&lt;/Instance&gt;&lt;/Entity&gt;&lt;/Response&gt;</GetResult></GetResponse>
                     </soap:Body></soap:Envelope>"""

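# Assumed imports (not shown in the snippet as posted): the standalone
# elementtree package.
from elementtree import ElementTree as ET, SimpleXMLTreeBuilder

CHARSET = 'ISO-8859-1'

# Use the pure-Python tree builder (the one that goes through xmllib.py).
ET.XMLTreeBuilder = SimpleXMLTreeBuilder.TreeBuilder
spEnv = TEST_EVNVELOPE2
# Rewrite the declared encoding, then encode the text to match it.
spEnv = spEnv.replace('utf-16', CHARSET)
spEnv = spEnv.replace('utf-8', CHARSET)
dom = ET.fromstring(spEnv.encode(CHARSET))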
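
For anyone who wants to reproduce this, here is a minimal, self-contained
sketch of the same replace-then-encode steps. It assumes a modern Python 3
with the standard library's xml.etree.ElementTree rather than my setup with
SimpleXMLTreeBuilder, and the helper name parse_with_charset and the sample
document are made up for illustration.

```python
import xml.etree.ElementTree as ET

CHARSET = "ISO-8859-1"


def parse_with_charset(xml_text, charset=CHARSET):
    """Rewrite the declared encoding, encode to match it, then parse.

    Hypothetical helper: mirrors the replace/encode steps from the post.
    """
    # Point the XML declaration at the charset we are about to encode with.
    xml_text = xml_text.replace("utf-16", charset)
    xml_text = xml_text.replace("utf-8", charset)
    # Encode the text so the bytes really are in the declared charset.
    return ET.fromstring(xml_text.encode(charset))


# Made-up sample document containing one character above 128 (U+00E4).
sample = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<Field Name="Name" Value="W\u00e4rmetechnik"/>'
)

root = parse_with_charset(sample)
print(root.tag, root.get("Value"))
```

This only works when the input is text (a Python 3 str); it says nothing
about what happens when the input is already a byte string.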

However, when spEnv is assigned response.content directly, I get a
parsing error in ET.fromstring, even though the TEST_EVNVELOPE2 value
was simply pasted from the browser's page source: I dumped
response.content into the rendered HTML and then copied it from the
HTML source. Why does it work as a string constant but not as a
variable value?
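
One thing that might explain the difference, though I have not confirmed it
for my service, is that the pasted constant and the raw response do not hold
the same bytes. A rough sketch, assuming Python 3's xml.etree.ElementTree and
a made-up one-element document (DOC and try_parse are hypothetical names),
shows how the declared encoding and the actual bytes can agree or disagree:

```python
import xml.etree.ElementTree as ET

# Made-up document template; %s is filled with the declared encoding.
DOC = '<?xml version="1.0" encoding="%s"?><Field Value="W\u00e4rme"/>'


def try_parse(raw_bytes):
    """Return the parsed Value attribute, or the parse error as a string."""
    try:
        return ET.fromstring(raw_bytes).get("Value")
    except ET.ParseError as exc:
        return "ParseError: %s" % exc


# Declaration and bytes agree: the attribute comes back intact.
matched = try_parse((DOC % "ISO-8859-1").encode("ISO-8859-1"))

# Declaration says utf-8 but the bytes are ISO-8859-1: the single 0xE4
# byte is not a valid UTF-8 sequence, so the parser rejects the input.
mismatched = try_parse((DOC % "utf-8").encode("ISO-8859-1"))

# Declaration says ISO-8859-1 but the bytes are UTF-8: this parses, but
# each byte of the two-byte sequence is decoded as its own character.
garbled = try_parse((DOC % "ISO-8859-1").encode("utf-8"))

for label, result in [("matched", matched),
                      ("mismatched", mismatched),
                      ("garbled", garbled)]:
    print(label, repr(result))
```

So it may be worth checking which encoding the bytes in response.content
actually use before rewriting the declaration.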

So, what's the correct way to make parsing work?
