K.S.Sreeram wrote:

> There's just NO WAY that the 10gb xml file can be loaded into memory as
> a tree on any normal machine, irrespective of whether we use C or
> Python. So the *only* way is to perform some kind of 'stream' processing
> on the file. Perhaps using a SAX like API. So (c)ElementTree is ruled
> out for this.

both ElementTree and cElementTree support "sax-style" event generation 
(through XMLTreeBuilder/XMLParser) and incremental parsing (through 
iterparse).  the cElementTree versions of these are even faster than 
pyexpat.
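
a minimal sketch of the "sax-style" route, assuming a current Python where the standard xml.etree.ElementTree module replaces the separate cElementTree package. the TagCounter class and the count_tags helper are made-up names for illustration; the parser calls start/end/data on the target object and never builds a tree:

```python
import xml.etree.ElementTree as ET


class TagCounter:
    """Hypothetical parser target: counts element tags, keeps no tree."""

    def __init__(self):
        self.counts = {}

    def start(self, tag, attrib):
        # Called for every opening tag.
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def end(self, tag):
        # Called for every closing tag; nothing to do here.
        pass

    def data(self, text):
        # Called for character data; ignored in this sketch.
        pass

    def close(self):
        # Whatever this returns is what parser.close() returns.
        return self.counts


def count_tags(chunks):
    """Feed the parser piece by piece, as one would with a huge file."""
    parser = ET.XMLParser(target=TagCounter())
    for chunk in chunks:
        parser.feed(chunk)
    return parser.close()


if __name__ == "__main__":
    # The input may be split anywhere, even in the middle of a tag.
    print(count_tags([b"<data><rec", b"ord/><record/></data>"]))
```

for a real file, the chunks would come from repeated f.read(n) calls on a file opened in binary mode, so only one chunk is held in memory at a time.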

the iterparse interface is described here:

     http://effbot.org/zone/element-iterparse.htm
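
a minimal sketch of incremental parsing with iterparse, again assuming the standard xml.etree.ElementTree module. the "record" tag name and the count_records helper are assumptions for illustration, not something from the original file. clearing the root after each record is what keeps memory use bounded:

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Count <tag> elements in an XML stream without keeping the full tree."""
    count = 0
    # Request "start" events as well, so the root element can be captured.
    context = iter(ET.iterparse(source, events=("start", "end")))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1  # elem is complete here; process it as needed
            root.clear()  # drop already-processed children of the root
    return count


if __name__ == "__main__":
    # Small in-memory stand-in for a large file on disk.
    sample = io.BytesIO(b"<data><record id='1'/><record id='2'/></data>")
    print(count_records(sample))
```

source can also be a filename, in which case iterparse opens and reads the file itself.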

</F>

-- 
http://mail.python.org/mailman/listinfo/python-list
