I managed to solve this using the following method:
    """Returns a dictionary of indexes of spectra for which there are secondary
    scans, along with the indexes of those scans
    """
    scans = dict()
    # get an iterable; "start" events are requested as well, because only
    # then is the first event guaranteed to carry the root element
    context = cElementTree.iterparse(self.info['filename'],
                                     events=("start", "end"))
    # turn it into an iterator
    context = iter(context)
    # get the root element (the first "start" event)
    event, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == self.XML_SPACE + "scan":
            parentId = int(elem.get('num'))
            for child in elem.findall(self.XML_SPACE + 'scan'):
                childId = int(child.get('num'))
                try:
                    indexes = scans[parentId]
                except KeyError:
                    indexes = []
                    scans[parentId] = indexes
                indexes.append(childId)
                # the child has been recorded, so free its contents
                child.clear()
            # drop the root's references to elements already handled
            root.clear()
    return scans
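I think the trick is to act only on 'end' events: by then an element and all
of its children have been parsed completely, so it is safe to read it and then
clear it, which limits how much of the tree iterparse keeps in memory. I'm
still not sure whether this is the best way to do it.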
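
For anyone who wants to try the idea in isolation, here is a minimal
self-contained sketch for Python 3. It is not the method from my class: the
function name secondary_scans, the SAMPLE document and the un-namespaced
"scan" tag are all made up for illustration, and it uses
xml.etree.ElementTree in place of cElementTree.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample data: scan 1 has two secondary scans, scan 4 has none.
SAMPLE = b"""<run>
  <scan num="1"><scan num="2"/><scan num="3"/></scan>
  <scan num="4"/>
</run>"""


def secondary_scans(source, tag="scan"):
    """Map each parent scan number to the numbers of its nested scans."""
    scans = {}
    # Request "start" too, so the very first event delivers the root element.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        # An "end" event means elem and all of its children are fully parsed.
        if event == "end" and elem.tag == tag:
            parent_id = int(elem.get("num"))
            for child in elem.findall(tag):
                scans.setdefault(parent_id, []).append(int(child.get("num")))
                child.clear()  # the child's data is no longer needed
            root.clear()  # drop the root's references to handled elements
    return scans


result = secondary_scans(io.BytesIO(SAMPLE))
print(result)  # {1: [2, 3]}
```

Scan 4 does not appear in the result because it has no nested scans, which
matches what the method above returns for spectra without secondary scans.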