Hello,

I run the script below to remove unneeded elements from a GPX file.

For some reason, the input file is left as-is when I also try to get rid of the <metadata> block. It works as expected when I leave that element out of the list.

Any idea why?

Thank you.

========== INPUT.GPX

<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="GPSBabel - http://www.acme.com" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <metadata>
    <time>2022-07-21T20:29:48.309Z</time>
    <bounds minlat="44.456597300" minlon="3.007453400" maxlat="45.803699000" maxlon="5.251047400"/>
  </metadata>
  <wpt lat="45.042569200" lon="5.040802200">
    <name>way/4749044</name>
    <cmt>landuse=cemetery</cmt>
    <desc>landuse=cemetery</desc>
    <link href="http://osm.org/browse/way/4749044"/>
  </wpt>
</gpx>

========== CLEAN.PY
import lxml.etree as et
import os

item = "input.gpx"
BASENAME, EXTENSION = os.path.splitext(os.path.basename(item))
OUTPUTFILE = f"{BASENAME}.EDITED{EXTENSION}"
PATH = os.path.dirname(os.path.abspath(item))
# os.path.join avoids the invalid "\{" escape and works on any OS
OUTPUT_FULLPATH = os.path.join(PATH, OUTPUTFILE)

parser = et.XMLParser(remove_blank_text=True,strip_cdata=False)
tree = et.parse(item,parser)
root = tree.getroot()

#=========== Remove blocks/elements
#BAD (output identical to input): for el in root.iter('cmt','desc','link','metadata'):
#BAD (output identical to input): for el in root.iter('metadata','cmt','desc','link'):
#OK (works as expected without 'metadata'):
for el in root.iter('cmt','desc','link'):
    parent = el.getparent()
    parent.remove(el)

#Write modified tree to new file
tree.write(OUTPUT_FULLPATH, xml_declaration=True,pretty_print=True,encoding='UTF-8')
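
========== POSSIBLE VARIANT (untested guess, not a confirmed diagnosis)

Two things may be worth ruling out. First, the elements in the sample sit in the GPX default namespace, so a bare name such as 'metadata' might not match them. Second, removing an element while root.iter() is still walking the tree may cut the walk short. The sketch below assumes both: it qualifies each tag with the namespace taken from the sample, and it collects the matches into a list before removing anything. The helper name strip_elements and the inline SAMPLE are made up for illustration.

```python
import lxml.etree as et

# Namespace copied from the xmlns attribute of the sample GPX file.
GPX_NS = "http://www.topografix.com/GPX/1/1"

# Trimmed copy of the sample input, inlined so the sketch runs on its own.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="GPSBabel - http://www.acme.com" xmlns="http://www.topografix.com/GPX/1/1">
  <metadata>
    <time>2022-07-21T20:29:48.309Z</time>
  </metadata>
  <wpt lat="45.042569200" lon="5.040802200">
    <name>way/4749044</name>
    <cmt>landuse=cemetery</cmt>
    <desc>landuse=cemetery</desc>
    <link href="http://osm.org/browse/way/4749044"/>
  </wpt>
</gpx>
"""


def strip_elements(root, local_names, namespace=GPX_NS):
    """Remove every element whose namespaced tag is in local_names.

    Hypothetical helper: returns the number of elements removed.
    """
    # Build "{namespace}name" tags so they match namespaced elements.
    tags = [f"{{{namespace}}}{name}" for name in local_names]
    # Materialise the matches first, so the tree is not modified mid-iteration.
    doomed = list(root.iter(*tags))
    for el in doomed:
        parent = el.getparent()
        if parent is not None:  # the root element has no parent
            parent.remove(el)
    return len(doomed)


parser = et.XMLParser(remove_blank_text=True, strip_cdata=False)
root = et.fromstring(SAMPLE, parser)
removed = strip_elements(root, ["metadata", "cmt", "desc", "link"])
remaining = [et.QName(el).localname for el in root.iter()]
print(removed, remaining)
```

If this variant behaves differently from the loop above on the same file, that would suggest the namespace or the remove-while-iterating pattern is involved.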

_______________________________________________
lxml - The Python XML Toolkit mailing list -- lxml@python.org