Re: [Tutor] Python XML for newbie
Sean Carolan wrote:

>> Thank you, this is helpful. Minidom is confusing, even the
>> documentation confirms this:
>> "The name of the functions are perhaps misleading"
>
>>> But I'd start with the etree tutorial (of which there are many
>>> variations on the web):
>
> Ok, so I read through these tutorials and am at least able to print
> the XML output now. I did this:
>
>     doc = etree.parse('computer_books.xml')
>
> and then this:
>
>     for elem in doc.iter():
>         print elem.tag, elem.text
>
> Here's the data I'm interested in:
>
>     index 1
>     field 11
>     value 9780596526740
>     datum
>
> How do you say, "If the field is 11, then print the next value"? The
> raw XML looks like this:
>
>     <datum>
>       <index>1</index>
>       <field>11</field>
>       <value>9780470286975</value>
>     </datum>
>
> Basically I just want to pull all these ISBN numbers from the file.

With http://lxml.de/ you can use xpath:

    $ cat computer_books.xml
    <foo>
      <bar>
        <datum>
          <index>1</index>
          <field>11</field>
          <value>9780470286975</value>
        </datum>
      </bar>
    </foo>
    $ cat read_isbn.py
    from lxml import etree

    root = etree.parse("computer_books.xml")
    print root.xpath("//datum[field=11]/value/text()")
    $ python read_isbn.py
    ['9780470286975']
    $

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
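For readers without lxml installed, the same kind of lookup can be sketched with the standard library. This is only an illustration, not the code from the message above: the sample document is made up from the snippets in this thread, and it relies on the limited XPath subset that xml.etree.ElementTree supports. Two differences are worth knowing: ElementTree has no text() step, so the text is read from the matched elements, and its [field='11'] predicate compares the child's text as an exact string, whereas the lxml expression field=11 is a numeric comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, assembled from the <datum> snippets in this thread.
SAMPLE = """
<foo>
  <bar>
    <datum>
      <index>1</index>
      <field>2</field>
      <value>Essential System Administration</value>
    </datum>
    <datum>
      <index>1</index>
      <field>11</field>
      <value>9780470286975</value>
    </datum>
  </bar>
</foo>
""".strip()

root = ET.fromstring(SAMPLE)

# Select the <value> child of every <datum> that has a <field> child
# whose text is exactly "11" (string comparison, not numeric).
isbns = [value.text for value in root.findall(".//datum[field='11']/value")]

print(isbns)  # ['9780470286975']
```

For a file on disk, ET.parse('computer_books.xml').getroot() would take the place of ET.fromstring(SAMPLE).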
Re: [Tutor] Python XML for newbie
Peter Otten, 02.07.2012 09:57:
> Sean Carolan wrote:
>>> Thank you, this is helpful. Minidom is confusing, even the
>>> documentation confirms this:
>>> "The name of the functions are perhaps misleading"

Yes, I personally think that (Mini)DOM should be locked away from
beginners as far as possible.

>> Ok, so I read through these tutorials and am at least able to print
>> the XML output now. I did this:
>>
>>     doc = etree.parse('computer_books.xml')
>>
>> and then this:
>>
>>     for elem in doc.iter():
>>         print elem.tag, elem.text
>>
>> Here's the data I'm interested in:
>>
>>     index 1
>>     field 11
>>     value 9780596526740
>>     datum
>>
>> How do you say, "If the field is 11, then print the next value"? The
>> raw XML looks like this:
>>
>>     <datum>
>>       <index>1</index>
>>       <field>11</field>
>>       <value>9780470286975</value>
>>     </datum>
>>
>> Basically I just want to pull all these ISBN numbers from the file.
>
> With http://lxml.de/ you can use xpath:
>
>     $ cat computer_books.xml
>     <foo>
>       <bar>
>         <datum>
>           <index>1</index>
>           <field>11</field>
>           <value>9780470286975</value>
>         </datum>
>       </bar>
>     </foo>
>     $ cat read_isbn.py
>     from lxml import etree
>
>     root = etree.parse("computer_books.xml")
>     print root.xpath("//datum[field=11]/value/text()")
>     $ python read_isbn.py
>     ['9780470286975']
>     $

And lxml.objectify is also a nice tool for this:

    $ cat example.xml
    <items>
      <item>
        <id>108</id>
        <data>
          <datum>
            <index>1</index>
            <field>2</field>
            <value>Essential System Administration</value>
          </datum>
        </data>
      </item>
    </items>

    $ python
    Python 2.7.3
    >>> from lxml import objectify
    >>> t = objectify.parse('example.xml')
    >>> for datum in t.iter('datum'):
    ...     if datum.field == 2:
    ...         print(datum.value)
    ...
    Essential System Administration

It's not impossible that this is faster than the XPath version, but
that depends a lot on the data.

Stefan
Re: [Tutor] Python XML for newbie
> Yes, I personally think that (Mini)DOM should be locked away from
> beginners as far as possible.

Ok, I'm glad to hear that. I'll continue to work with ElementTree and
lxml and see where it takes me.
[Tutor] Python XML for newbie
I'm trying to parse some XML data (book titles, ISBN numbers and
descriptions) with Python. Is there a *simple* way to import an XML
file into a dictionary, list, or other usable data structure? I've
poked around with minidom, elementtree, and untangle but am not really
understanding how they are supposed to work.

Here's some sample data:

    <xml>
      <fields>
        <field>
          <name>Title</name>
          <id>2</id>
          <count>1</count>
          <type>11</type>
          <search>true</search>
          <hasnumber>false</hasnumber>
        </field>
        ...several more fields, then there are the items...
      </fields>
      <items>
        <item>
          <id>108</id>
          <data>
            <datum>
              <index>1</index>
              <field>2</field>
              <value>Essential System Administration</value>
            </datum>

For starters, I'd like to be able to just print out the list of titles
in the XML file, using the correct XML parser. I don't mind doing some
research or reading on my own, but the official documentation seems
terribly confusing to me.

http://docs.python.org/library/xml.dom.minidom.html

Any pointers?
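One possible way to get such a file into plain Python data structures is sketched below with the standard library's xml.etree.ElementTree. It is only an illustration under stated assumptions: the sample above is truncated, so the closing tags, the second field definition (named "ISBN" with id 11) and the pairing of that ISBN with item 108 are invented here, and load_items is a hypothetical helper name. The idea is to build a lookup from field id to field name first, then turn each item into a dictionary keyed by those names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, completed version of the truncated sample in the question.
# The "ISBN" field definition and the ISBN/title pairing are assumptions.
SAMPLE = """
<xml>
  <fields>
    <field><name>Title</name><id>2</id></field>
    <field><name>ISBN</name><id>11</id></field>
  </fields>
  <items>
    <item>
      <id>108</id>
      <data>
        <datum>
          <index>1</index>
          <field>2</field>
          <value>Essential System Administration</value>
        </datum>
        <datum>
          <index>1</index>
          <field>11</field>
          <value>9780596526740</value>
        </datum>
      </data>
    </item>
  </items>
</xml>
""".strip()


def load_items(root):
    """Return one dict per <item>, keyed by the names declared in <fields>."""
    # Map each field id (as a string) to its human-readable name.
    names = {
        field.findtext("id"): field.findtext("name")
        for field in root.iterfind("fields/field")
    }
    items = []
    for item in root.iterfind("items/item"):
        record = {"id": item.findtext("id")}
        for datum in item.iterfind("data/datum"):
            field_id = datum.findtext("field")
            # Fall back to the raw id when no name is declared for it.
            record[names.get(field_id, field_id)] = datum.findtext("value")
        items.append(record)
    return items


root = ET.fromstring(SAMPLE)
books = load_items(root)

for book in books:
    print(book["Title"])  # Essential System Administration
```

All values stay strings, since ElementTree does no type conversion. An item with several datum entries for the same field would keep only the last one in this sketch.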
Re: [Tutor] Python XML for newbie
On 01/07/12 21:49, Sean Carolan wrote:
> ... Is there a *simple* way to import an XML file into a dictionary,
> list, or other usable data structure?

The simplest way using the standard library tools is (IMHO)
elementtree. minidom is a complex beast by comparison, especially if
you are not intimately familiar with your XML structure.

However, there are some other add-in packages that are allegedly much
easier still. But I'd start with the etree tutorial (of which there are
many variations on the web):

The original:
http://effbot.org/zone/element-index.htm

My preference:
http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/index.html

You may not need anything else...

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
Re: [Tutor] Python XML for newbie
> The simplest way using the standard library tools is (IMHO)
> elementtree. minidom is a complex beast by comparison, especially if
> you are not intimately familiar with your XML structure.

Thank you, this is helpful. Minidom is confusing, even the
documentation confirms this:

    "The name of the functions are perhaps misleading"

> But I'd start with the etree tutorial (of which there are many
> variations on the web):
>
> The original:
> http://effbot.org/zone/element-index.htm
>
> My preference:
> http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/index.html

I'm going to work through those and see what I can come up with.
Re: [Tutor] Python XML for newbie
> Thank you, this is helpful. Minidom is confusing, even the
> documentation confirms this:
> "The name of the functions are perhaps misleading"

>> But I'd start with the etree tutorial (of which there are many
>> variations on the web):

Ok, so I read through these tutorials and am at least able to print
the XML output now. I did this:

    doc = etree.parse('computer_books.xml')

and then this:

    for elem in doc.iter():
        print elem.tag, elem.text

Here's the data I'm interested in:

    index 1
    field 11
    value 9780596526740
    datum

How do you say, "If the field is 11, then print the next value"? The
raw XML looks like this:

    <datum>
      <index>1</index>
      <field>11</field>
      <value>9780470286975</value>
    </datum>

Basically I just want to pull all these ISBN numbers from the file.
Re: [Tutor] Python XML for newbie
On Mon, Jul 2, 2012 at 12:31 PM, Sean Carolan <scaro...@gmail.com> wrote:
> How do you say, "If the field is 11, then print the next value"? The
> raw XML looks like this:
>
>     <datum>
>       <index>1</index>
>       <field>11</field>
>       <value>9780470286975</value>
>     </datum>

Instead of iterating over the whole tree, grab all the datum elements.
Then, for each one, retrieve its field child and check that child's
text; if it is '11', pull the text of the value child.
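The steps described above can be sketched roughly as follows, using the standard library's xml.etree.ElementTree. This is a minimal illustration, not code from the thread: the sample document is made up from the snippets quoted earlier, and values_for_field is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, modelled on the <datum> snippets in this thread.
SAMPLE = """
<foo>
  <bar>
    <datum>
      <index>1</index>
      <field>2</field>
      <value>Essential System Administration</value>
    </datum>
    <datum>
      <index>1</index>
      <field>11</field>
      <value>9780470286975</value>
    </datum>
  </bar>
</foo>
""".strip()


def values_for_field(root, field_id):
    """Return the <value> text of every <datum> whose <field> text is field_id."""
    found = []
    # Visit only the <datum> elements, wherever they sit in the tree.
    for datum in root.iter("datum"):
        # findtext() returns None if the child is missing, so a <datum>
        # without a <field> is simply skipped.
        if datum.findtext("field") == field_id:
            found.append(datum.findtext("value"))
    return found


root = ET.fromstring(SAMPLE)
isbns = values_for_field(root, "11")
print(isbns)  # ['9780470286975']
```

Note that the comparison is against the string "11": ElementTree hands back element text as strings, so comparing with the integer 11 would never match.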