An opportunity to work in Python, and the necessity of working with some XML too large to visualize, got me thinking about an answer Alan Gauld had written to me a few years ago (https://mail.python.org/pipermail/tutor/2015-June/105810.html). I have applied that information in this script, but I have another question :)
Let's say I have an xml file like this:

-------------- order.xml ----------------
<salesorder>
    <customername>Bob</customername>
    <customerlocation>321 Main St</customerlocation>
    <saleslines>
        <salesline>
            <item>D20</item>
            <quantity>4</quantity>
        </salesline>
        <salesline>
            <item>CS211</item>
            <quantity>1</quantity>
        </salesline>
        <salesline>
            <item>BL5</item>
            <quantity>7</quantity>
        </salesline>
        <salesline>
            <item>AC400</item>
            <quantity>1</quantity>
        </salesline>
    </saleslines>
</salesorder>
---------- end order.xml ----------------

Items CS211 and AC400 are not valid items, and I want to remove their <salesline> nodes. I came up with the following (python 3.6.7 on linux):

------------ xml_delete_test.py --------------------
import os
import xml.etree.ElementTree as ET

hd = os.path.expanduser('~')
inputxml = os.path.join(hd,'order.xml')
outputxml = os.path.join(hd,'fixed_order.xml')

valid_items = ['D20','BL5']

tree = ET.parse(inputxml)
root = tree.getroot()
saleslines = root.find('saleslines').findall('salesline')
for e in saleslines[:]:
    if e.find('item').text not in valid_items:
        saleslines.remove(e)

tree.write(outputxml)
---------- end xml_delete_test.py ------------------

The above code runs without error, but simply writes the original file to disk. The desired output would be:

-------------- fixed_order.xml ----------------
<salesorder>
    <customername>Bob</customername>
    <customerlocation>321 Main St</customerlocation>
    <saleslines>
        <salesline>
            <item>D20</item>
            <quantity>4</quantity>
        </salesline>
        <salesline>
            <item>BL5</item>
            <quantity>7</quantity>
        </salesline>
    </saleslines>
</salesorder>
---------- end fixed_order.xml ----------------

What I find particularly confusing about the problem is that after running xml_delete_test.py in the Idle editor, if I go over to the shell and type saleslines, I can see that it's now a list of two elements. I run the following:

for i in saleslines:
    print(i.find('item').text)

and I see that it's D20 and BL5, my two valid items.
Yet when I write tree out to the disk, it has the original four. Do I need to refresh tree somehow?

Thanks!
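
For reference, here is a minimal sketch of what I suspect is going on, offered as a guess rather than a confirmed diagnosis. findall() returns an ordinary Python list of the matching elements, so removing entries from that list would not touch the tree itself. Calling remove() on the parent <saleslines> element should drop the nodes from the tree. The helper name remove_invalid_saleslines and the inline ORDER_XML string are made up for illustration; the real script reads order.xml from disk.

```python
import xml.etree.ElementTree as ET

# Inline copy of order.xml so the sketch runs on its own (assumption:
# the real script would keep using ET.parse(inputxml) instead).
ORDER_XML = """<salesorder>
    <customername>Bob</customername>
    <customerlocation>321 Main St</customerlocation>
    <saleslines>
        <salesline><item>D20</item><quantity>4</quantity></salesline>
        <salesline><item>CS211</item><quantity>1</quantity></salesline>
        <salesline><item>BL5</item><quantity>7</quantity></salesline>
        <salesline><item>AC400</item><quantity>1</quantity></salesline>
    </saleslines>
</salesorder>"""


def remove_invalid_saleslines(root, valid_items):
    """Hypothetical helper: drop <salesline> nodes whose <item> is not valid."""
    parent = root.find('saleslines')
    # findall() hands back a separate list, so it is safe to loop over it
    # while removing children from the parent element.
    for line in parent.findall('salesline'):
        if line.find('item').text not in valid_items:
            parent.remove(line)  # removes the node from the tree itself
    return root


root = ET.fromstring(ORDER_XML)
remove_invalid_saleslines(root, {'D20', 'BL5'})

# Collect the item codes still present in the tree.
remaining = [item.text for item in root.iter('item')]
print(remaining)
```

In the original script the same idea would mean keeping a reference to root.find('saleslines') and calling remove() on it inside the loop, then calling tree.write(outputxml) as before.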