T.R. D., 20.06.2010 08:03:
I'm trying to parse a list of xml strings and so far it looks like the
xml.parsers.expat is the way to go but I'm not quite sure how it works.

I'm trying to parse something similar to the following.  I'd like to collect
all headings and bodies and associate them in a variable (dictionary for
example). How would I use the expat class to do this?

Well, you *could* use it, but I *would* not recommend it. :)


<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

<note>
<to>Jani</to>
<from>Tovi</from>
<heading>Reminder 2</heading>
<body>Don't forget to bring snacks!</body>
</note>

Use ElementTree's iterparse:

    from xml.etree.ElementTree import iterparse

    # by default, iterparse() yields an ("end", element) pair for each closing tag
    for _, element in iterparse("the_file.xml"):
        if element.tag == 'note':
            # find text in interesting child elements
            print(element.findtext('heading'), element.findtext('body'))

            # save some memory by removing the handled content
            element.clear()

iterparse() iterates over parser events, but it also builds an in-memory XML tree while doing so, which makes it trivial to find things in the stream. The above code receives an event whenever a tag closes and only acts when the closed element is a 'note', i.e. when the complete subtree of the note element has been parsed into memory.
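
Since the question asked about a list of XML strings collected into a dictionary, here is a minimal sketch of how the same approach might look for that case. The function name collect_notes and the choice of using the heading as the dictionary key are assumptions made for illustration, and it assumes each string is one well-formed document without a conflicting encoding declaration. It is written for Python 3.

```python
import io
from xml.etree.ElementTree import iterparse


def collect_notes(xml_strings):
    """Map each note's heading text to its body text (hypothetical helper)."""
    notes = {}
    for xml_string in xml_strings:
        # iterparse() expects a filename or a file object, so wrap the string
        source = io.BytesIO(xml_string.encode("utf-8"))
        for _, element in iterparse(source):  # "end" events only by default
            if element.tag == "note":
                # assumption: headings are unique, so they can serve as keys
                notes[element.findtext("heading")] = element.findtext("body")
                # drop the handled subtree to save memory
                element.clear()
    return notes


xml_strings = [
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body></note>",
    "<note><to>Jani</to><from>Tovi</from>"
    "<heading>Reminder 2</heading>"
    "<body>Don't forget to bring snacks!</body></note>",
]

notes = collect_notes(xml_strings)
for heading, body in notes.items():
    print(heading, "->", body)
```

If two notes can share a heading, a list of (heading, body) tuples would be a safer container than a dictionary, since later entries would otherwise overwrite earlier ones.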

Stefan

_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
