Re: Python parsing XML file problem with SAX

Christian Heimes Mon, 09 Aug 2010 16:42:58 -0700

Am 10.08.2010 01:20, schrieb Aahz:
> The docs say, "Parses an XML section into an element tree incrementally".
> Sure sounds like it retains the entire parsed tree in RAM.  Not good.
> Again, how do you parse an XML file larger than your available memory
> using something other than SAX?


The document at
http://www.ibm.com/developerworks/xml/library/x-hiperfparse/ explains it
one way.

The iterparser approach is ingenious but it doesn't work for every XML
format. Let's say you have a 10 GB XML file with one million <part/>
tags. An iterparser doesn't load the entire document. Instead it
iterates over the file and yields (for example) one million ElementTrees
for each <part/> tag and its children. You can get the nice API of
ElementTree with the memory efficiency of a SAX parser if you obey
"Listing 4".

Christian

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Python parsing XML file problem with SAX

Reply via email to