New submission from Luyang Han <[EMAIL PROTECTED]>:

When the xml.dom.pulldom module is used to parse a large XML file, and all the information is stored in that one file, the module can process it in the following way without constructing the whole DOM:

    import xml.dom.pulldom

    events = xml.dom.pulldom.parse('file.xml')
    for (event, node) in events:
        process(event, node)

But if 'file.xml' references a large external entity, for example:

    <!ENTITY file_external SYSTEM "others.xml">
    <body>&file_external;</body>

then the same Python snippet leads to enormous memory usage. I did not run a proper benchmark, but in one case an external XML file of about 3 MB consumed about 1 GB of memory. I suspect that in this case the whole DOM structure is constructed.

----------
components: XML
messages: 66628
nosy: hanselda
severity: normal
status: open
title: pulldom cannot handle xml file with large external entity properly
type: resource usage
versions: Python 2.5

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2818>
__________________________________
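
For anyone who wants to try the setup described in the report, here is a rough reproduction sketch. It is not from the original submission and makes several assumptions: it targets Python 3 rather than 2.5, the helper name count_items and the <item> elements are made up for illustration, and peak memory is measured with tracemalloc as one possible stand-in for the benchmark the report says was not run. Since Python 3.7.1 the SAX parser no longer expands external general entities by default, so the sketch enables that feature explicitly; only do this with trusted input. The entity declaration uses an absolute path so that the external file is found regardless of the current working directory.

```python
import os
import tempfile
import tracemalloc
import xml.sax
from xml.dom import pulldom
from xml.sax.handler import feature_external_ges


def count_items(n_items):
    """Parse a document whose body is pulled in through an external entity.

    Returns (number of <item> start events seen, peak traced memory in bytes).
    The function name and document layout are hypothetical.
    """
    with tempfile.TemporaryDirectory() as tmp:
        external = os.path.join(tmp, "others.xml")
        main = os.path.join(tmp, "file.xml")

        # The external entity: a flat run of <item> elements.
        with open(external, "w", encoding="utf-8") as f:
            f.write("".join(f"<item>{i}</item>" for i in range(n_items)))

        # The main document declares the entity and references it in <body>.
        with open(main, "w", encoding="utf-8") as f:
            f.write(
                "<!DOCTYPE body [\n"
                f'  <!ENTITY file_external SYSTEM "{external}">\n'
                "]>\n"
                "<body>&file_external;</body>\n"
            )

        # External general entities are off by default in recent Python 3,
        # so switch them on for this (trusted) input.
        parser = xml.sax.make_parser()
        parser.setFeature(feature_external_ges, True)

        tracemalloc.start()
        seen = 0
        for event, node in pulldom.parse(main, parser=parser):
            if event == pulldom.START_ELEMENT and node.tagName == "item":
                seen += 1
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

    return seen, peak
```

Calling count_items with increasing values and comparing the reported peak against the same content placed directly in one file would give a rough picture of how memory grows with the size of the external entity. The numbers will depend on the Python version and platform.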