K.S.Sreeram wrote:

> There's just NO WAY that the 10gb xml file can be loaded into memory as
> a tree on any normal machine, irrespective of whether we use C or
> Python. So the *only* way is to perform some kind of 'stream' processing
> on the file. Perhaps using a SAX like API. So (c)ElementTree is ruled
> out for this.

both ElementTree and cElementTree support "sax-style" event generation
(through XMLTreeBuilder/XMLParser) and incremental parsing (through
iterparse). the cElementTree versions of these are even faster than
pyexpat.

the iterparse interface is described here:

    http://effbot.org/zone/element-iterparse.htm

</F>
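
For illustration, here is a minimal sketch of incremental parsing with iterparse. It assumes the standard-library xml.etree.ElementTree module (where ElementTree lives in current Python versions) rather than a separately installed cElementTree. The element name "record", the count_records helper, and the tiny in-memory sample are hypothetical stand-ins for the real 10gb file. The idea is to handle each element as soon as its end tag is seen and then clear the root, so the full tree is never held in memory.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Count <tag> elements in `source` without keeping the whole tree.

    `source` is a filename or a binary file object; `tag` is a
    hypothetical element name chosen for this sketch.
    """
    count = 0
    # Ask for "start" events as well, so the first event hands us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1      # real per-record processing would go here
            root.clear()    # discard children that are already processed
    return count


# Small in-memory document standing in for a very large file.
sample = io.BytesIO(
    b"<data><record id='1'/><record id='2'/><record id='3'/></data>"
)
n = count_records(sample)
print(n)
```

Clearing the root after each record, rather than only the record element itself, also removes the emptied child references that would otherwise pile up on the root while a long file is read.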