Hey;

Reasonably new to Python and incredibly new to XML, much less to parsing it. I need to identify cluster nodes from a series of WebLogic XML configuration files. I've figured out how to get 75% of them; now I'm going after the edge case, and I'm unsure how to proceed.
WebLogic XML config files start with namespace definitions, then a number of child elements, some of which have children of their own. The element that I'm interested in is <server>, which will usually have a subelement called <listen-address> containing the hostname that I'm looking for. Following the paradigm of "we love standards, we got lots of them", this model doesn't work everywhere. Where it doesn't work, I need to look for a subelement of <server> called <machine>. That element contains an alias which is expanded in a different root child, at the same level as <server>.

So, picture worth a 1000 words:

    <?xml version='1.0' encoding='UTF-8'?>
    < [[ heinous namespace xml snipped ]] >
      <name>[[text]]</name>
      ...
      <server>
        <name>EDIServices_MS1</name>
        ...
        <machine>EDIServices_MC1</machine>
        ...
      </server>
      <server>
        <name>EDIServices_MS2</name>
        ...
        <machine>EDIServices_MC2</machine>
        ...
      </server>
      <machine xsi:type="unix-machineType">
        <name>EDIServices_MC1</name>
        <node-manager>
          <name>EDIServices_MC1</name>
          <nm-type>SSL</nm-type>
          <listen-address>host001</listen-address>
          <listen-port>7001</listen-port>
        </node-manager>
      </machine>
      <machine xsi:type="unix-machineType">
        <name>EDIServices_MC2</name>
        <node-manager>
          <name>EDIServices_MC2</name>
          <listen-address>host002</listen-address>
          <listen-port>7001</listen-port>
        </node-manager>
      </machine>
    </domain>

So, running it on a 'normal' config, I get:

    $ ./lxml configs/EntsvcSoa_Domain_config.xml
    EntsvcSoa_CS    => host003.myco.com
    EntsvcSoa_CS    => host004.myco.com

Running it against the abi-normal config, I'm currently getting:

    $ ./lxml configs/EDIServices_Domain_config.xml
    EDIServices_CS  => EDIServices_MC1
    EDIServices_CS  => EDIServices_MC2

Using the examples above, I would like to translate EDIServices_MC1 and EDIServices_MC2 to host001 and host002 respectively.
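For what it's worth, the translation described above could probably be done in two passes: first build a dictionary from each top-level <machine> name to its <node-manager>/<listen-address>, then fall back to that dictionary whenever a <server> has no listen address of its own. What follows is only a minimal sketch under several assumptions: it uses the standard library's xml.etree.ElementTree, the namespace URI is a placeholder (the real one is snipped above), the <cluster> elements in the sample are filled in for illustration, and the function names machine_hosts and cluster_nodes are made up.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption); use the one declared in the real config.
NAMESPACES = {"ns": "http://example.com/domain"}

# Cut-down sample modelled on the config above; the <cluster> elements are illustrative.
SAMPLE = """\
<domain xmlns="http://example.com/domain"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <name>EDIServices</name>
  <server>
    <name>EDIServices_MS1</name>
    <machine>EDIServices_MC1</machine>
    <cluster>EDIServices_CS</cluster>
  </server>
  <server>
    <name>EDIServices_MS2</name>
    <machine>EDIServices_MC2</machine>
    <cluster>EDIServices_CS</cluster>
  </server>
  <machine xsi:type="unix-machineType">
    <name>EDIServices_MC1</name>
    <node-manager>
      <name>EDIServices_MC1</name>
      <listen-address>host001</listen-address>
    </node-manager>
  </machine>
  <machine xsi:type="unix-machineType">
    <name>EDIServices_MC2</name>
    <node-manager>
      <name>EDIServices_MC2</name>
      <listen-address>host002</listen-address>
    </node-manager>
  </machine>
</domain>
"""


def machine_hosts(root):
    """Map each top-level <machine> name to its node manager's listen address."""
    hosts = {}
    # 'ns:machine' matches only direct children of root, not <server>/<machine>.
    for machine in root.findall("ns:machine", NAMESPACES):
        name = machine.findtext("ns:name", namespaces=NAMESPACES)
        address = machine.findtext(
            "ns:node-manager/ns:listen-address", namespaces=NAMESPACES
        )
        if name and address:
            hosts[name] = address
    return hosts


def cluster_nodes(root):
    """Return (cluster, host) pairs, resolving machine aliases where needed."""
    hosts = machine_hosts(root)
    nodes = []
    for server in root.findall("ns:server", NAMESPACES):
        cluster = server.findtext("ns:cluster", namespaces=NAMESPACES)
        if not cluster:
            continue  # server is not part of a cluster
        host = server.findtext("ns:listen-address", namespaces=NAMESPACES)
        if not host:
            # No usable listen address: look the machine alias up instead,
            # keeping the alias itself if it cannot be resolved.
            alias = server.findtext("ns:machine", namespaces=NAMESPACES)
            host = hosts.get(alias, alias)
        if host:
            nodes.append((cluster, host))
    return nodes


if __name__ == "__main__":
    for cluster_name, server_name in cluster_nodes(ET.fromstring(SAMPLE)):
        print("%-15s => %s" % (cluster_name, server_name))
```

Building the dictionary once avoids re-searching the tree for every server, and it needs no XPath beyond the simple paths already in use.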
The primary loop is:

    for server in root.findall('ns:server', namespaces):
        cs = server.find('ns:cluster', namespaces)
        if cs is None:
            continue
        # cluster_name = server.find('ns:cluster', namespaces).text
        cluster_name = cs.text
        listen_address = server.find('ns:listen-address', namespaces)
        # Guard against a missing <listen-address> element as well as an empty one.
        server_name = listen_address.text if listen_address is not None else None
        if server_name is None:
            machine = server.find('ns:machine', namespaces)
            if machine is None:
                continue
            else:
                server_name = machine.text
        print("%-15s => %s" % (cluster_name, server_name))

(it's taken me days to write 12 lines of code... good thing I don't do this for a living :) )

Rephrased, I need to find the <listen-address> under the <machine> child whose name matches the <machine> value under the corresponding <server> child. From some of the examples on the web, I believe XPath might help, but I've not been able to get even the simple examples working. Go figure, I just figured out what a namespace is...

Any hints/tips/suggestions greatly appreciated, especially complete noob tutorials for XPath.

Thanks for your time.

Doug O'Leary
-- 
https://mail.python.org/mailman/listinfo/python-list