New submission from Igor Nowicki <thesmilingcatofchesh...@gmail.com>:

Suppose we have a large XML file that cannot be loaded into memory all at once. We then use the `iterparse` function from the xml.etree.ElementTree module to parse it element by element. The problem is that `iterparse` does not handle this smoothly: it starts reporting wrong data once more than 16 KiB of input has been read (16*1024 bytes, a value I found by looking into the source code). An element that has a large number of children is reported as having just a few.

To reproduce the problem, I created the attached example program. It generates simple XML files of progressively larger size and tracks how many children of each main object are counted. For small objects we get the actual number, 100 children. As the size grows we get smaller and smaller numbers, going down to just a few.

----------
components: Library (Lib)
files: find_records.py
messages: 333549
nosy: Igor Nowicki
priority: normal
severity: normal
status: open
title: XML.etree bug
type: performance
versions: Python 3.6
Added file: https://bugs.python.org/file48046/find_records.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35729>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
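
For readers without access to find_records.py, the following is a minimal sketch of how a symptom like the one in the report above could arise. It is not the attached script. It assumes the children are counted when a `start` event fires, which the report does not state. The helper names `build_xml` and `count_children` and the `<root>`/`<record>`/`<child>` layout are hypothetical. The documentation for `iterparse` notes that on a `start` event an element's children may or may not be present yet, and that a fully populated element is only guaranteed on `end` events.

```python
import io
import xml.etree.ElementTree as ET


def build_xml(n_records, n_children=100, pad=0):
    """Return an XML document as bytes (hypothetical layout, not from the report):
    <root> holds n_records <record> elements, each with n_children <child>
    elements whose text is `pad` filler characters."""
    filler = "x" * pad
    parts = ["<root>"]
    for _ in range(n_records):
        parts.append("<record>")
        parts.extend(f"<child>{filler}</child>" for _ in range(n_children))
        parts.append("</record>")
    parts.append("</root>")
    return "".join(parts).encode("utf-8")


def count_children(xml_bytes, event):
    """Record how many children each <record> has at the moment it is
    handed to us for the given event type ("start" or "end")."""
    counts = []
    for _, elem in ET.iterparse(io.BytesIO(xml_bytes), events=(event,)):
        if elem.tag == "record":
            counts.append(len(elem))
            if event == "end":
                # The record is complete; drop its subtree to bound memory use.
                elem.clear()
    return counts


# With pad=1000 each record is roughly 100 KB, far larger than 16 KiB.
data = build_xml(n_records=5, pad=1000)
end_counts = count_children(data, "end")      # complete elements
start_counts = count_children(data, "start")  # possibly incomplete elements
print("end:  ", end_counts)
print("start:", start_counts)
```

Under these assumptions the `end` counts are always 100, while the `start` counts depend on how much input had been parsed when the event was delivered and can be much lower for large records. Counting on `end` events and then calling `elem.clear()` keeps memory bounded without losing children.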