Re: Finding all instances of a string in an XML file
> xml = """<?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE KMART SYSTEM "my.dtd">
> <LEVEL_1>
>   <LEVEL_2 ATTR="hello">
>     <ATTRIBUTE NAME="Property X" VALUE ="2"/>
>   </LEVEL_2>
>   <LEVEL_2 ATTR="goodbye">
>     <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
>     <LEVEL_3 ATTR="aloha">
>       <ATTRIBUTE NAME="Property X" VALUE ="3"/>
>     </LEVEL_3>
>     <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
>   </LEVEL_2>
> </LEVEL_1>
> """
>
> import xml.etree.ElementTree as etree
>
> tree = etree.fromstring(xml)
>
> def walk(elem, path, token):
>     # path is a tuple, so += builds a new tuple for this branch only
>     path += (elem,)
>     if token in elem.attrib.values():
>         yield path
>     # iterate the element directly; getchildren() is gone in Python 3.9+
>     for child in elem:
>         for match in walk(child, path, token):
>             yield match
>
> for path in walk(tree, (), "Property X"):
>     print(", ".join("{} {}".format(elem.tag, elem.attrib) for elem in path))

Peter, thank you, that exactly meets my need.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Finding all instances of a string in an XML file
Jason Friedman wrote:

> I have XML which looks like:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE KMART SYSTEM "my.dtd">
> <LEVEL_1>
>   <LEVEL_2 ATTR="hello">
>     <ATTRIBUTE NAME="Property X" VALUE ="2"/>
>   </LEVEL_2>
>   <LEVEL_2 ATTR="goodbye">
>     <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
>     <LEVEL_3 ATTR="aloha">
>       <ATTRIBUTE NAME="Property X" VALUE ="3"/>
>     </LEVEL_3>
>     <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
>   </LEVEL_2>
> </LEVEL_1>
>
> The "Property X" string appears twice and I want to output the path
> that leads to all such appearances. In this case the output would be:
>
> LEVEL_1 {}, LEVEL_2 {"ATTR": "hello"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "2"}
> LEVEL_1 {}, LEVEL_2 {"ATTR": "goodbye"}, LEVEL_3 {"ATTR": "aloha"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "3"}
>
> My actual XML file is 2000 lines and contains up to 8 levels of nesting.

That's still small, so you can load the whole document into memory and
walk the tree recursively:

xml = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""

import xml.etree.ElementTree as etree

tree = etree.fromstring(xml)

def walk(elem, path, token):
    # path is a tuple, so += builds a new tuple for this branch only
    path += (elem,)
    if token in elem.attrib.values():
        yield path
    # iterate the element directly; getchildren() is gone in Python 3.9+
    for child in elem:
        for match in walk(child, path, token):
            yield match

for path in walk(tree, (), "Property X"):
    print(", ".join("{} {}".format(elem.tag, elem.attrib) for elem in path))
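A note on the "still small" remark: if the file were too large to hold in memory, one possible variation (not something posted in this thread) would be to stream it with xml.etree.ElementTree.iterparse and keep a stack of the currently open elements. The sketch below assumes the sample XML from the question with the DOCTYPE line left out; the names SAMPLE and stream_matches are made up for illustration.

```python
import io
import xml.etree.ElementTree as etree

# Hypothetical sample mirroring the XML in the thread (DOCTYPE omitted).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""


def stream_matches(source, token):
    """Yield a tuple of (tag, attributes) pairs, root first, for every
    element that has `token` among its attribute values."""
    stack = []  # (tag, attributes) of the currently open elements
    for event, elem in etree.iterparse(source, events=("start", "end")):
        if event == "start":
            # Tag and attributes are already set on "start"; copy the
            # attributes because the element is cleared on "end".
            stack.append((elem.tag, dict(elem.attrib)))
            if token in elem.attrib.values():
                yield tuple(stack)
        else:
            stack.pop()
            elem.clear()  # discard children, text and attributes


paths = list(stream_matches(io.BytesIO(SAMPLE), "Property X"))
for path in paths:
    print(", ".join("{} {}".format(tag, attrib) for tag, attrib in path))
```

For a 2000-line file the recursive walk above is simpler and perfectly adequate; the streaming version only pays off when the parsed tree itself becomes a memory concern.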
Re: Finding all instances of a string in an XML file
Jason Friedman <jsf80...@gmail.com> writes:

> I have XML which looks like:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE KMART SYSTEM "my.dtd">
> <LEVEL_1>
>   <LEVEL_2 ATTR="hello">
>     <ATTRIBUTE NAME="Property X" VALUE ="2"/>
>   </LEVEL_2>
>   <LEVEL_2 ATTR="goodbye">
>     <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
>     <LEVEL_3 ATTR="aloha">
>       <ATTRIBUTE NAME="Property X" VALUE ="3"/>
>     </LEVEL_3>
>     <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
>   </LEVEL_2>
> </LEVEL_1>
>
> The "Property X" string appears twice and I want to output the path
> that leads to all such appearances.

You could use "lxml" and its XPath support.

This is a high-end approach: you would be pulling in a powerful (and big)
infrastructure, though one that could also be of use for other XML
applications. There are more elementary approaches as well (e.g. parse
the XML into a DOM and provide your own visitor to find the elements
you are interested in).
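To make the XPath idea concrete without installing anything, here is a rough standard-library approximation. It is not lxml: it uses the limited XPath subset of xml.etree.ElementTree plus a hand-built child-to-parent map, since ElementTree elements do not record their parent. It also assumes the token always sits in the NAME attribute, which is narrower than matching any attribute value. SAMPLE, parent_of and ancestors_and_self are illustrative names only.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample mirroring the XML in the thread (prolog omitted).
SAMPLE = """<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""

root = etree.fromstring(SAMPLE)

# Build the child -> parent lookup once, up front.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors_and_self(elem):
    """Return the elements from the root down to elem, inclusive."""
    path = [elem]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path[::-1]


# XPath subset: every descendant whose NAME attribute equals the token.
matches = root.findall(".//*[@NAME='Property X']")

lines = [
    ", ".join("{} {}".format(e.tag, e.attrib) for e in ancestors_and_self(m))
    for m in matches
]
for line in lines:
    print(line)
```

The trade-off is roughly the one described above: the query itself becomes a one-liner, at the cost of extra machinery (here the parent map; with lxml, a third-party dependency) to recover the path to each hit.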
Re: Finding all instances of a string in an XML file
Thank you Peter and Dieter, will give those thoughts a try and report back.
Finding all instances of a string in an XML file
I have XML which looks like:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>

The "Property X" string appears twice and I want to output the path
that leads to all such appearances. In this case the output would be:

LEVEL_1 {}, LEVEL_2 {"ATTR": "hello"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "2"}
LEVEL_1 {}, LEVEL_2 {"ATTR": "goodbye"}, LEVEL_3 {"ATTR": "aloha"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "3"}

My actual XML file is 2000 lines and contains up to 8 levels of nesting.

I have tried this so far (partial code, using the
xml.etree.ElementTree module):

def get_path(data_dictionary, val, path):
    for node in data_dictionary[CHILDREN]:
        if node[CHILDREN]:
            if not path or node[TAG] != path[-1]:
                path.append(node[TAG])
            print(CR + "recursing ...")
            get_path(node, val, path)
        else:
            for k,v in node[ATTRIB].items():
                if v == val:
                    print("path- ", path)
                    print(" " + node[TAG] + " " + str(node[ATTRIB]))

I'm really not even close to getting the output I am looking for.

Python 3.2.2.

Thank you.