T.R. D., 20.06.2010 08:03:
> I'm trying to parse a list of xml strings and so far it looks like the
> xml.parsers.expat is the way to go but I'm not quite sure how it works.
> I'm trying to parse something similar to the following. I'd like to collect
> all headings and bodies and associate them in a variable (dictionary for
> example). How would I use the expat class to do this?

Well, you *could* use it, but I *would* not recommend it. :)

> <note>
> <to>Tove</to>
> <from>Jani</from>
> <heading>Reminder</heading>
> <body>Don't forget me this weekend!</body>
> </note>
> <note>
> <to>Jani</to>
> <from>Tovi</from>
> <heading>Reminder 2</heading>
> <body>Don't forget to bring snacks!</body>
> </note>

Use ElementTree's iterparse:

    from xml.etree.cElementTree import iterparse

    for _, element in iterparse("the_file.xml"):
        if element.tag == 'note':
            # find text in interesting child elements
            print element.findtext('heading'), element.findtext('body')
            # save some memory by removing the handled content
            element.clear()

iterparse() iterates over parser events, but it builds an in-memory XML
tree while doing so. That makes it trivial to find things in the stream.
The above code receives an event whenever an element's closing tag has been
parsed, and does its work only when that element is a 'note', i.e. when the
complete subtree of the note element has been parsed into memory.
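
To get the dictionary you asked about, the same loop can fill a dict instead of printing. The following is only a sketch under a few assumptions of mine: it targets Python 3 (where plain xml.etree.ElementTree is imported, since cElementTree was removed in 3.9), it wraps the notes in a made-up <notes> root element because a well-formed XML document has exactly one root, and the names XML_DATA and collect_notes are hypothetical.

```python
import io
from xml.etree.ElementTree import iterparse

# Hypothetical sample data: the two notes from the question, wrapped in an
# assumed <notes> root so that the input is a single well-formed document.
XML_DATA = b"""<notes>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note>
<to>Jani</to>
<from>Tovi</from>
<heading>Reminder 2</heading>
<body>Don't forget to bring snacks!</body>
</note>
</notes>"""


def collect_notes(source):
    """Map each note's heading text to its body text."""
    notes = {}
    # By default iterparse() reports only 'end' events, i.e. closed elements.
    for _, element in iterparse(source):
        if element.tag == 'note':
            # The whole <note> subtree is in memory at this point.
            notes[element.findtext('heading')] = element.findtext('body')
            # Drop the handled children to keep memory use low.
            element.clear()
    return notes


notes = collect_notes(io.BytesIO(XML_DATA))
print(notes)
```

iterparse() accepts a file name or a file object, so the BytesIO wrapper is only there to keep the sketch self-contained. Note that two notes with the same heading would overwrite each other in the dict; if that can happen in your data, a list of (heading, body) tuples is probably the safer container.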
Stefan