On 10.08.2010 01:20, Aahz wrote:
> The docs say, "Parses an XML section into an element tree incrementally".
> Sure sounds like it retains the entire parsed tree in RAM. Not good.
> Again, how do you parse an XML file larger than your available memory
> using something other than SAX?

The article at http://www.ibm.com/developerworks/xml/library/x-hiperfparse/ explains one way to do it. The iterparse approach is ingenious, but it doesn't work for every XML format; it fits documents that are a long sequence of independent records.

Let's say you have a 10 GB XML file with one million <part/> tags. An iterparser doesn't load the entire document up front. Instead it iterates over the file and yields (for example) one million elements, one for each <part/> tag together with its children. You get the nice API of ElementTree with the memory efficiency of a SAX parser if you follow "Listing 4" in that article.

Christian
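
P.S. Here is a minimal sketch of the clear-as-you-go pattern, which is how I read the idea behind that listing (this reading is an assumption, and the code is not copied from the article). It uses the standard library's xml.etree.ElementTree instead of lxml. The function name iter_parts and the sample data are made up for illustration, and the sketch assumes the <part/> elements are direct children of the root element.

```python
import io
import xml.etree.ElementTree as ET


def iter_parts(source, tag="part"):
    """Yield each completed <part> element, then drop it from the tree.

    Assumption: the <part> elements are direct children of the root.
    """
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the "start" of the root element; keep a handle on it.
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            # The element and its children are complete at this point.
            yield elem
            # Once the caller is done with it, remove everything parsed so
            # far from the root so the tree does not grow with the file.
            root.clear()


# Hypothetical sample; a real source would be a file name or an open file.
sample = io.BytesIO(
    b"<parts>"
    b"<part id='1'><name>bolt</name></part>"
    b"<part id='2'><name>nut</name></part>"
    b"</parts>"
)
names = [part.findtext("name") for part in iter_parts(sample)]
```

The caller has to finish with each element before asking for the next one, because the element is emptied as soon as the generator resumes. Without the root.clear() call the parser would keep every <part/> attached to the root, and the whole tree would end up in RAM after all.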