New submission from Jess Johnson <[email protected]>:
When given XML that would raise a ParseError, but parsing is stopped
before the ParseError is raised, xml.etree.ElementTree.iterparse leaks memory.
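
To make the precondition concrete, here is a minimal sketch (not part of the original report) showing that iterparse yields the LEVEL1 end event first and only raises ParseError if iteration continues past it. The names MALFORMED_XML and consume_fully are hypothetical helpers introduced for illustration.

```python
from io import StringIO
import xml.etree.ElementTree as etree

# Same malformed document as in the report: </ROOT> has no opening tag.
MALFORMED_XML = """
<LEVEL1>
</LEVEL1>
</ROOT>
"""


def consume_fully(xml_text):
    """Iterate iterparse to the end.

    Returns a tuple of (tags seen before any error, whether ParseError was raised).
    """
    seen = []
    try:
        for _, elem in etree.iterparse(StringIO(xml_text)):
            seen.append(elem.tag)
    except etree.ParseError:
        # The error only surfaces once iteration moves past LEVEL1.
        return seen, True
    return seen, False


if __name__ == "__main__":
    print(consume_fully(MALFORMED_XML))
```

Breaking out of the loop at LEVEL1, as the example in the report does, therefore stops before the pending error is ever raised.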
Example:
import gc
from io import StringIO
import xml.etree.ElementTree as etree

import objgraph


def parse_xml():
    xml = """
    <LEVEL1>
    </LEVEL1>
    </ROOT>
    """
    parser = etree.iterparse(StringIO(initial_value=xml))
    for _, elem in parser:
        if elem.tag == 'LEVEL1':
            break


def run():
    parse_xml()
    gc.collect()
    uncollected_elems = objgraph.by_type('Element')
    print(uncollected_elems)
    objgraph.show_backrefs(uncollected_elems, max_depth=15)


if __name__ == "__main__":
    run()
Output:
[<Element 'LEVEL1' at 0x10df712c8>]
Also see this gist which has an image showing the objects that are retained in
memory: https://gist.github.com/grokcode/f89d5c5f1831c6bc373be6494f843de3
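
For anyone without objgraph installed, the same check can be approximated with the standard library alone. This is a rough sketch, not the reporter's code: surviving_elements and parse_and_stop_early are hypothetical names, and it assumes Element instances are visible through gc.get_objects(), which is also what the objgraph output above relies on. Whether any instance is reported as retained depends on the Python version; the report is against 3.7.

```python
import gc
from io import StringIO
import xml.etree.ElementTree as etree


def surviving_elements():
    """Return Element instances still known to the garbage collector."""
    gc.collect()
    return [obj for obj in gc.get_objects() if isinstance(obj, etree.Element)]


def parse_and_stop_early(xml_text):
    """Break out of iterparse at LEVEL1, before the ParseError would surface."""
    for _, elem in etree.iterparse(StringIO(xml_text)):
        if elem.tag == "LEVEL1":
            break


if __name__ == "__main__":
    before = len(surviving_elements())
    parse_and_stop_early("\n<LEVEL1>\n</LEVEL1>\n</ROOT>\n")
    after = len(surviving_elements())
    # A positive difference suggests elements were kept alive after the call.
    print("Element instances retained by the call:", after - before)
```

Comparing counts before and after the call avoids blaming iterparse for Element objects that other code in the process legitimately holds.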
----------
components: XML
messages: 331861
nosy: jess.j
priority: normal
severity: normal
status: open
title: Memory leak in xml.etree.ElementTree.iterparse
type: resource usage
versions: Python 3.7
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35502>
_______________________________________