While trying to write the code using cElementTree, I stumbled across one problem. Sometimes there are multiple values to retrieve from one record for the same element, like this:

<Prot-ref_name_E>ATP-binding cassette, subfamily G, member 1</Prot-ref_name_E>
<Prot-ref_name_E>ATP-binding cassette 8</Prot-ref_name_E>

How do you get not only the first value but the rest as well, so that I can store them in a list?

Thanks in advance,
Willem Ligtenberg

On Fri, 22 Apr 2005 13:48:15 +0200, Willem Ligtenberg wrote:

> This is all the info I need from the xml file:
> ID --> <Gene-track_geneid>320632</Gene-track_geneid>
>
> Name --> <Gene-ref>
>            <Gene-ref_locus>Pzp</Gene-ref_locus>
>
> Startbase --> <Gene-commentary_seqs>
>                 <Seq-loc>
>                   <Seq-loc_int>
>                     <Seq-interval>
>                       <Seq-interval_from>126957426</Seq-interval_from>
>                       <Seq-interval_to>126989473</Seq-interval_to>
>                       <Seq-interval_strand>
>                         <Na-strand value="plus"/>
>                       </Seq-interval_strand>
>                       <Seq-interval_id>
>                         <Seq-id>
>                           <Seq-id_gi>51860766</Seq-id_gi>
>                         </Seq-id>
>                       </Seq-interval_id>
>                     </Seq-interval>
>                   </Seq-loc_int>
>                 </Seq-loc>
>               </Gene-commentary_seqs>
> Endbase
>
> Function --> <Prot-ref_name>
>                <Prot-ref_name_E>U5 snRNP-specific protein, 200 kDa</Prot-ref_name_E>
>                <Prot-ref_name_E>U5 snRNP-specific protein, 200 kDa (DEXH RNA helicase family)</Prot-ref_name_E>
>              </Prot-ref_name>
>
> DBLink --> <Gene-ref_locus-tag>MGI:2444401</Gene-ref_locus-tag>
>            <Gene-commentary_source>
>              <Other-source>
>                <Other-source_src>
>                  <Dbtag>
>                    <Dbtag_db>GO</Dbtag_db>
>                    <Dbtag_tag>
>                      <Object-id>
>                        <Object-id_id>5524</Object-id_id>
>                      </Object-id>
>                    </Dbtag_tag>
>                  </Dbtag>
>                </Other-source_src>
>                <Other-source_anchor>ATP binding</Other-source_anchor>
>                <Other-source_post-text>evidence: ISS</Other-source_post-text>
>              </Other-source>
>            </Gene-commentary_source>
>
> Product-type --> <Entrezgene_type value="protein-coding">6</Entrezgene_type>
>
> gene-comment --> <Gene-ref_desc>activating signal cointegrator 1 complex subunit 3-like 1</Gene-ref_desc>
>
> synonym --> <Gene-ref_syn>
>               <Gene-ref_syn_E>HELIC2</Gene-ref_syn_E>
>               <Gene-ref_syn_E>KIAA0788</Gene-ref_syn_E>
>               <Gene-ref_syn_E>U5-200KD</Gene-ref_syn_E>
>               <Gene-ref_syn_E>U5-200-KD</Gene-ref_syn_E>
>               <Gene-ref_syn_E>A330064G03Rik</Gene-ref_syn_E>
>             </Gene-ref_syn>
>
> EC --> <Prot-ref_ec>
>          <Prot-ref_ec_E>1.5.1.5</Prot-ref_ec_E>
>          <Prot-ref_ec_E>3.5.4.9</Prot-ref_ec_E>
>        </Prot-ref_ec>
>
> Chromosome: <SubSource>
>               <SubSource_subtype value="chromosome">1</SubSource_subtype>
>               <SubSource_name>6</SubSource_name>
>             </SubSource>
>
> Some can happen more than once in a record.
>
>
> On Fri, 22 Apr 2005 02:41:46 -0400, William Park wrote:
>
>> Willem Ligtenberg <[EMAIL PROTECTED]> wrote:
>>> On Sun, 17 Apr 2005 02:16:04 +0000, William Park wrote:
>>> > Care to post more details?
>>>
>>> The XML file I need to parse contains information about genes.
>>> So the first element is a gene and then there are a lot of sub-elements with
>>> sub-elements. I only need some of the information and want to store it in
>>> an object called gene. Later on this information will be printed into a
>>> file, which in its turn will be fed into some other program.
>>
>> You have to help us a little more here. Which info do you want to
>> extract from the example below?
>>
>>> <Entrezgene-Set>
>>> ...
>>> </Entrezgene-Set>

--
http://mail.python.org/mailman/listinfo/python-list
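
On the question at the top of this message (collecting every repeated element, not just the first), here is a minimal sketch. It assumes the standard-library module xml.etree.ElementTree, which offers the same find/findall interface that cElementTree provided, and a hypothetical, trimmed-down record assembled from the fragments quoted in this thread. The enclosing <Entrezgene> element and the helper name all_texts are made up for illustration. findall returns every matching element rather than only the first, so a list comprehension over it yields all the values:

```python
# Assumption: xml.etree.ElementTree stands in for the 2005 cElementTree package;
# the find/findall/findtext calls used here exist in both.
import xml.etree.ElementTree as ET

# Hypothetical sample record, built only from fragments quoted in this thread.
SAMPLE = """
<Entrezgene>
  <Prot-ref_name>
    <Prot-ref_name_E>ATP-binding cassette, subfamily G, member 1</Prot-ref_name_E>
    <Prot-ref_name_E>ATP-binding cassette 8</Prot-ref_name_E>
  </Prot-ref_name>
  <Gene-ref_syn>
    <Gene-ref_syn_E>HELIC2</Gene-ref_syn_E>
    <Gene-ref_syn_E>KIAA0788</Gene-ref_syn_E>
  </Gene-ref_syn>
</Entrezgene>
"""


def all_texts(record, tag):
    """Return the text of every descendant element named `tag`, in document order."""
    # ".//tag" searches the whole subtree below `record`, not just direct children.
    return [elem.text for elem in record.findall(".//" + tag)]


record = ET.fromstring(SAMPLE)

names = all_texts(record, "Prot-ref_name_E")      # every protein name
synonyms = all_texts(record, "Gene-ref_syn_E")    # every gene synonym
first_name = record.findtext(".//Prot-ref_name_E")  # only the first match

print(names)
print(synonyms)
```

find and findtext stop at the first match, which is why only one value comes back from them. findall returns an empty list when the element is absent, so records without, say, any <Prot-ref_ec_E> entries need no special case.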