> >>> for m in mydata.findall('//functions'):
>         print m.get('molecular_class').text
>
> >>> for m in mydata.findall('//functions'):
>         print m.find('molecular_class').text.strip()
>
> >>> for process in mydata.findall('//biological_process'):
>         print process.get('title').text

Hello,

I believe we're running into XML namespace issues.  If we look at all the
tag names in the XML, we can see this:

######
>>> from elementtree import ElementTree
>>> tree = ElementTree.parse(open('00004.xml'))
>>> for element in tree.getroot()[0]: print element.tag
...
{org:hprd:dtd:hprdr2}title
{org:hprd:dtd:hprdr2}alt_title
{org:hprd:dtd:hprdr2}alt_title
{org:hprd:dtd:hprdr2}alt_title
{org:hprd:dtd:hprdr2}alt_title
{org:hprd:dtd:hprdr2}alt_title
{org:hprd:dtd:hprdr2}omim
{org:hprd:dtd:hprdr2}gene_symbol
{org:hprd:dtd:hprdr2}gene_map_locus
{org:hprd:dtd:hprdr2}seq_entry
{org:hprd:dtd:hprdr2}molecular_weight
{org:hprd:dtd:hprdr2}entry_sequence
{org:hprd:dtd:hprdr2}protein_domain_architecture
{org:hprd:dtd:hprdr2}expressions
{org:hprd:dtd:hprdr2}functions
{org:hprd:dtd:hprdr2}cellular_component
{org:hprd:dtd:hprdr2}interactions
{org:hprd:dtd:hprdr2}EXTERNAL_LINKS
{org:hprd:dtd:hprdr2}author
{org:hprd:dtd:hprdr2}last_updated
######

(I'm just doing a quick view of the toplevel elements in the tree.)

As we can see, each element's tag is prefixed, in curly braces, with the
namespace URI declared in the XML document.  If we search the document for
the attribute 'xmlns', we'll see where this 'org:hprd:dtd:hprdr2' thing
comes from.  A plain path like '//functions' therefore matches nothing,
because the element's real tag is '{org:hprd:dtd:hprdr2}functions'.

So we may need to prepend the namespace to get the proper terms:

######
>>> for process in tree.find("//{org:hprd:dtd:hprdr2}biological_processes"):
...     print process.findtext("{org:hprd:dtd:hprdr2}title")
...
Metabolism
Energy pathways
######

To tell the truth, I don't quite understand how to work fluently with XML
namespaces, so perhaps there's an easier way to do what you want.  But the
examples above should help you get started parsing all your Gene Ontology
annotations.

Good luck!
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
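
P.S.  In case it is useful, here is a minimal sketch of one way to avoid
typing the full namespace string in every path.  It assumes Python 3 and the
standard library's xml.etree.ElementTree rather than the standalone
elementtree package used in the session above.  The sample document is made
up for illustration: only the namespace and the names 'functions',
'molecular_class', 'biological_processes' and 'title' come from this thread,
and the nesting (including 'entry_set', 'entry' and the 'biological_process'
children) is my guess at the layout.  The prefix 'hprd' is an arbitrary
label that I chose; it does not have to appear in the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a record such as 00004.xml; the nesting is assumed.
SAMPLE = """\
<entry_set xmlns="org:hprd:dtd:hprdr2">
  <entry>
    <title>Example protein</title>
    <functions>
      <molecular_class>Enzyme</molecular_class>
      <biological_processes>
        <biological_process><title>Metabolism</title></biological_process>
        <biological_process><title>Energy pathways</title></biological_process>
      </biological_processes>
    </functions>
  </entry>
</entry_set>
"""

# Map a short prefix of our own choosing to the namespace declared via xmlns.
NS = {"hprd": "org:hprd:dtd:hprdr2"}


def process_titles(root):
    """Return the title of each child of the first biological_processes element."""
    processes = root.find(".//hprd:biological_processes", NS)
    if processes is None:
        return []
    return [
        child.findtext("hprd:title", default="", namespaces=NS).strip()
        for child in processes
    ]


root = ET.fromstring(SAMPLE)
titles = process_titles(root)
molecular_class = root.findtext(
    ".//hprd:functions/hprd:molecular_class", default="", namespaces=NS
).strip()

print(titles)           # ['Metabolism', 'Energy pathways']
print(molecular_class)  # Enzyme
```

For a real file, ET.parse('00004.xml').getroot() would take the place of
ET.fromstring(SAMPLE); the paths may need adjusting to the actual nesting.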