An opportunity to work in Python, and the necessity of working with some XML 
too large to visualize, got me thinking about an answer Alan Gauld had written 
to me a few years ago 
(https://mail.python.org/pipermail/tutor/2015-June/105810.html).  I have 
applied that information in this script, but I have another question :)

Let's say I have an XML file like this:

-------------- order.xml ----------------

<salesorder>
    <customername>Bob</customername>
    <customerlocation>321 Main St</customerlocation>
    <saleslines>
        <salesline>
            <item>D20</item>
            <quantity>4</quantity>
        </salesline>
        <salesline>
            <item>CS211</item>
            <quantity>1</quantity>
        </salesline>
        <salesline>
            <item>BL5</item>
            <quantity>7</quantity>
        </salesline>
        <salesline>
            <item>AC400</item>
            <quantity>1</quantity>
        </salesline>
    </saleslines>
</salesorder>

---------- end order.xml ----------------

Items CS211 and AC400 are not valid items, and I want to remove their 
<salesline> nodes.  I came up with the following (Python 3.6.7 on Linux):

------------ xml_delete_test.py --------------------

import os
import xml.etree.ElementTree as ET

hd = os.path.expanduser('~')
inputxml = os.path.join(hd,'order.xml')
outputxml = os.path.join(hd,'fixed_order.xml')

valid_items = ['D20','BL5']

tree = ET.parse(inputxml)
root = tree.getroot()
saleslines = root.find('saleslines').findall('salesline')
for e in saleslines[:]:
    if e.find('item').text not in valid_items:
        saleslines.remove(e)

tree.write(outputxml)

---------- end xml_delete_test.py ------------------

The above code runs without error, but writes the original content to disk 
unchanged.  The desired output would be:

-------------- fixed_order.xml ----------------

<salesorder>
    <customername>Bob</customername>
    <customerlocation>321 Main St</customerlocation>
    <saleslines>
        <salesline>
            <item>D20</item>
            <quantity>4</quantity>
        </salesline>
        <salesline>
            <item>BL5</item>
            <quantity>7</quantity>
        </salesline>
    </saleslines>
</salesorder>

---------- end fixed_order.xml ----------------

What I find particularly confusing about the problem is that after running 
xml_delete_test.py in the IDLE editor, if I go over to the shell and type 
saleslines, I can see that it's now a list of two elements.  I run the 
following:

for i in saleslines:
    print(i.find('item').text)

and I see that it's D20 and BL5, my two valid items.  Yet when I write tree out 
to the disk, it has the original four.  Do I need to refresh tree somehow?

Thanks!
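
One likely explanation, offered as a sketch rather than a definitive answer: 
findall() returns an ordinary Python list of references to the child elements, 
so saleslines.remove(e) only shrinks that list and leaves the children of the 
<saleslines> element untouched.  Calling remove() on the parent element itself 
should change the tree.  The names ORDER_XML, parent and remaining below are 
illustrative only, and the XML is inlined (with two of the four lines) so the 
example runs without order.xml:

```python
import xml.etree.ElementTree as ET

# Shortened, inlined stand-in for order.xml (assumption: same structure).
ORDER_XML = """<salesorder>
    <customername>Bob</customername>
    <saleslines>
        <salesline><item>D20</item><quantity>4</quantity></salesline>
        <salesline><item>CS211</item><quantity>1</quantity></salesline>
        <salesline><item>BL5</item><quantity>7</quantity></salesline>
        <salesline><item>AC400</item><quantity>1</quantity></salesline>
    </saleslines>
</salesorder>"""

valid_items = ['D20', 'BL5']

root = ET.fromstring(ORDER_XML)
parent = root.find('saleslines')

# findall() hands back a separate list, so it is safe to iterate over it
# while removing the matching children from the parent element.
for e in parent.findall('salesline'):
    if e.find('item').text not in valid_items:
        parent.remove(e)  # modifies the tree, not just a list

# Query the tree again to confirm what is left in it.
remaining = [e.find('item').text for e in parent.findall('salesline')]
print(remaining)
```

With the file-based script, the same idea would mean keeping a reference to 
root.find('saleslines') and calling remove() on it before tree.write().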