bruce wrote:
> The following text contains sample data. I'm simply trying to parse it
> using libxml2dom as the lib to extract data.
>
> As an example, to get the name/desc
>
> test data
> <class_meta_data><departments><department><name><![CDATA[A
> HTG]]></name><desc><![CDATA[American
> Heritage]]></desc></department><department><name><! [CDATA[ACC]]></name><desc><![CDATA[Accounting]]></desc></department>
>
> d = libxml2dom.parseString(s, html=1)
>
> p1="//department/name"
> p2="//department/desc"
>
> pcount_ = d.xpath(p1)
> p2_ = d.xpath(p2)
> print str(len(pcount_))
> nba=0
>
> for a in pcount_:
>     abbrv=a.nodeValue
>     print abbrv
>     abbrv=a.toString()
>     print abbrv
>     abbrv=a.textContent
>     print abbrv
>
> neither of the above generates any of the CML name/desc data..
>
> any pointers on what I'm missing???

Your example seems to work here when I omit the html=1

d = libxml2dom.parseString(s)
...

> I can/have created a quick parse/split process to get the data, but I
> thought there'd be a straight forward process to extract the data
> using one of the py/libs..

One way using the stdlib:

from xml.etree import ElementTree as ET

#root = ET.parse(filename).getroot()
root = ET.fromstring(data)
for department in root.findall(".//department"):
    name = department.find("name").text
    desc = department.find("desc").text
    print("{}: {}".format(name, desc))
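
For anyone who wants to try the stdlib approach without a data file, here is a self-contained sketch. The SAMPLE string is modelled on the data in the post, with two assumptions: the closing </departments> and </class_meta_data> tags are added (the quoted sample stops before them), and the line-wrapped CDATA sections are joined back onto one line. department_pairs is just a name picked for illustration.

```python
from xml.etree import ElementTree as ET

# Sample data modelled on the post. The two closing tags at the end are
# assumed; the quoted sample stops before them. The wrapped CDATA sections
# are joined back together.
SAMPLE = (
    "<class_meta_data><departments>"
    "<department><name><![CDATA[A HTG]]></name>"
    "<desc><![CDATA[American Heritage]]></desc></department>"
    "<department><name><![CDATA[ACC]]></name>"
    "<desc><![CDATA[Accounting]]></desc></department>"
    "</departments></class_meta_data>"
)


def department_pairs(xml_text):
    """Return a list of (name, desc) tuples, one per <department> element."""
    root = ET.fromstring(xml_text)
    return [
        # findtext returns the element's text; the parser hands CDATA
        # content over as ordinary text, so no special handling is needed.
        (dept.findtext("name", default=""), dept.findtext("desc", default=""))
        for dept in root.iter("department")
    ]


if __name__ == "__main__":
    for name, desc in department_pairs(SAMPLE):
        print("{}: {}".format(name, desc))
```

Using findtext with a default instead of find(...).text avoids an AttributeError if a department happens to lack a name or desc element.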