Re: ElementTree XML parsing problem

Stefan Behnel Wed, 27 Apr 2011 23:01:09 -0700

Hegedüs Ervin, 27.04.2011 21:33:

hello,

I'm using ElementTree to parse an XML file, but it stops at the
second record (id = 002), which contains a non-ASCII
character, ä. Here's the XML:

<?xml version="1.0"?>
<snapshot time="Mon Apr 25 08:47:23 PDT 2011">
<records>
<record id="001" education="High School" employment="7 yrs" />
<record id="002" education="Universität Bremen" employment="3 years" />
<record id="003" education="River College" employment="5 yrs" />
</records>
</snapshot>

The complaint offered up by the parser is


I've checked this XML with your script; I think your locale
settings are wrong.

$ ./parse.py

XML file: test.xml
001 High School
002 Universität Bremen
003 River College

(name of xml file is "test.xml")

So I started changing the encoding declaration of the XML:

<?xml version="1.0" encoding="UTF-8" ?>  - same result
<?xml version="1.0" encoding="ISO-8859-2" ?>  - same result
<?xml version="1.0" encoding="ISO-8859-1" ?>  - same result

You probably changed this in an editor that supports XML and thus saves the file in the declared encoding. Switching between the three by simply changing the first line (the XML declaration) and not adapting the encoding of the document itself would otherwise not yield the same result for the document given above.
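
To illustrate, here is a minimal sketch (Python 3, standard library only) that keeps the bytes of the document fixed as UTF-8 and only swaps the declared encoding. The names BODY and parse_with_declaration are made up for this example, and the record is cut down to the one attribute that matters; it is not the original poster's script.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the document from the thread.
BODY = '<records><record id="002" education="Universität Bremen" /></records>'


def parse_with_declaration(declared, actual="UTF-8"):
    """Encode BODY as `actual`, but label it as `declared` in the XML declaration."""
    decl = '<?xml version="1.0" encoding="%s"?>' % declared
    # The declaration itself is pure ASCII; only the body bytes depend on `actual`.
    data = decl.encode("ascii") + BODY.encode(actual)
    root = ET.fromstring(data)
    return root.find("record").get("education")


# Same bytes on disk (UTF-8), three different declarations.
results = {
    enc: parse_with_declaration(enc)
    for enc in ("UTF-8", "ISO-8859-1", "ISO-8859-2")
}

for enc, value in results.items():
    print(enc, repr(value))
```

Only the UTF-8 declaration matches the bytes, so only that run gives back the original string; with the other two, the parser decodes the two UTF-8 bytes of ä as two separate characters. Getting "the same result" for all three declarations therefore suggests that the bytes were re-encoded each time the file was saved.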


Stefan

--
http://mail.python.org/mailman/listinfo/python-list
