You are on the right track. Just add another nested loop over each element's
children and look for the items in there. You didn't say what you want to do
with the children once you find them, but this will at least let you find
specific elements using your technique.

john

-----Original Message-----
From: Gaurav Kumar [mailto:[email protected]] 
Sent: Monday, March 22, 2010 8:18 PM
To: [email protected]
Subject: Parsing Nested XML tags in Xercers-C


 Hi,

 I'm new to Xerces-C and unsure about many of its concepts. I thought I
 would learn this useful API by following tutorials and the problems
 discussed on the mailing list.

 I'm able to extract the attributes of the <LOCATE_protein> tag. This tag
 contains nested children. I need to traverse the XML tree to fetch
 the required information from the nested tags (child or grandchild
 nodes). Can anyone suggest a simple function to do that in Xerces-C?
 Below are the sample XML file and the modified code
 (http://www.yolinux.com/TUTORIALS/XML-Xerces-C.html).

 Thanks in advance

 Cheers
 Gaurav

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 <LOCATE_interaction xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <LOCATE_protein uid="6000002" uniprot="P27824" refseq="">
           <externalannot>
                <source db="HPRD" db_id="00252" goid="GO:0005764">Lysosomes</source>
                <source db="HPRD" db_id="00252" goid="GO:0005635">Nuclear Envelope</source>
                <source db="HPRD" db_id="00252" goid="GO:0005794">Golgi Apparatus</source>
                <source db="HPRD" db_id="00252" goid="GO:0005783">Endoplasmic Reticulum</source>
                <source db="HPRD" db_id="00252" goid="GO:0005886">Plasma Membrane</source>
                <source db="UniProt/SPTrEMBL" db_id="P27824" goid="GO:0005783">endoplasmic reticulum</source>
                <source db="UniProt/SPTrEMBL" db_id="P27824" goid="GO:0042470">melanosome</source>
           </externalannot>
           <literature></literature>
           <direct_interaction>
                <entry source="HPRD" source_id="00252" uniprot="P27824" refseq="NP_001737.1">
                     <name>Calnexin</name>
                     <interactor type="direct" pubmed_id="8136357">
                          <molecule source_id="00127" gene_symbol="IFNGR1" uniprot="P15260" refseq="">Interferon gamma receptor 1</molecule>
                     </interactor>
                </entry>
           </direct_interaction>
           <metabolic_interaction>
                <entry source_id="hsa:55832">
                     <gene_name>CAND1</gene_name>
                     <defination>cullin-associated and neddylation-dissociated 1</defination>
                     <orthology></orthology>
                     <class></class>
                     <enzyme></enzyme>
                </entry>
                <entry source_id="ENSG00000111530-MONOMER"></entry>
           </metabolic_interaction>
      </LOCATE_protein>
      ....
      ....
      .....
 </LOCATE_interaction>



 m_ConfigFileParser->parse( configFile.c_str() );

 DOMDocument* xmlDoc = m_ConfigFileParser->getDocument();

 DOMElement* elementRoot = xmlDoc->getDocumentElement();
 if( !elementRoot ) throw(std::runtime_error( "empty XML document" ));
 DOMNodeList* children = elementRoot->getChildNodes();

 // Note: this counts every child node, including whitespace text nodes,
 // not only the <LOCATE_protein> elements.
 cout << "Total Locates Proteins : " << children->getLength() << endl;

 for( XMLSize_t xx = 0; xx < children->getLength(); ++xx )
 {
    DOMNode* currentNode = children->item(xx);
    if( currentNode->getNodeType() == DOMNode::ELEMENT_NODE )
    {
       // Found node which is an Element. Re-cast node as element
       DOMElement* currentElement
                   = dynamic_cast< xercesc::DOMElement* >( currentNode );
       if( XMLString::equals(currentElement->getTagName(), TAG_locateProtein) )
       {
          // Already tested node as type element and of name "LOCATE_protein".
          // Read attributes of element "LOCATE_protein".
          // XMLString::transcode() allocates; free each result later with
          // XMLString::release().
          const XMLCh* xmlch_locateID
                = currentElement->getAttribute(ATTR_locateID);
          m_locateID = XMLString::transcode(xmlch_locateID);

          const XMLCh* xmlch_locateUniprotID
                = currentElement->getAttribute(ATTR_locateUniprotID);
          m_locateUniprotID = XMLString::transcode(xmlch_locateUniprotID);

          const XMLCh* xmlch_locateRefseqID
                = currentElement->getAttribute(ATTR_locateRefseqID);
          m_locateRefseqID = XMLString::transcode(xmlch_locateRefseqID);

          cout << "Locate ID:"
               << m_locateID
               << "|UniprotID:"
               << m_locateUniprotID
               << "|RefseqID:"
               << m_locateRefseqID
               << endl;

          // The first child is usually a whitespace text node, so check for
          // null. XMLCh strings cannot be streamed to cout directly, so
          // transcode them before printing.
          DOMNode* currentChild = currentNode->getFirstChild();
          if( currentChild )
          {
             char* text = XMLString::transcode(currentChild->getTextContent());
             char* name = XMLString::transcode(currentChild->getNodeName());
             cout << text << endl;
             cout << name << endl;
             XMLString::release(&text);
             XMLString::release(&name);
          }
       }
    }
 }



-- 
Mr. Gaurav Kumar
PhD Student (Bioinformatics/Computational Biology)
