On Fri, 2010-01-29 at 10:34 -0800, jakecjacobson wrote:
> On Jan 29, 1:04 pm, Adam Tauno Williams <awill...@opengroupware.us>
> wrote:
> > On Fri, 2010-01-29 at 09:25 -0800, jakecjacobson wrote:
> > > I need to take a XML web resource and split it up into smaller XML
> > > files. I am able to retrieve the web resource but I can't find any
> > > good XML examples. I am just learning Python so forgive me if this
> > > question has been answered many times in the past.
> > >
> > > My resource is like:
> > > <document>
> > > ...
> > > ...
> > > </document>
> > > <document>
> > > </document>
> > >
> > > So in this example, I would need to output 2 files with the contents
> > > of each file what is between the open and close document tag.
> >
> > Do you want to parse the document or SaX?
> >
> > I have a SaX example at
> > <http://coils.hg.sourceforge.net/hgweb/coils/coils/file/99b227b08f7f/s...>
>
> Thanks but I am way over my head with XML, Python. I am working with
> DDMS and need to output the individual resource nodes to their own
> file. I hope that this helps and I need a good example and how to use
> it.

If that is all you need, XPath will split it apart for you, as in
<http://coils.hg.sourceforge.net/hgweb/coils/coils/file/99b227b08f7f/src/coils/logic/workflow/actions/xml/xpath.py>

    doc = etree.parse(self._rfile)
    results = doc.xpath(xpath)
    for result in results:
        print(str(result))

For example, if your XML has an outermost element of ResultSet with inner
row elements, just do:

    for record in doc.xpath(u'/ResultSet/row'):
        pass  # handle each row element here

The implied import for these examples is "from lxml import etree".

> Here is what a resource node looks like:
> <ddms:Resource
>     xsi:schemaLocation="https://metadata.dod.mil/mdr/ns/DDMS/1.4/
>     https://metadata.dod.mil/mdr/ns/DDMS/1.4/"
>     xmlns:ddms="https://metadata.dod.mil/mdr/ns/DDMS/1.4/"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xmlns:ICISM="urn:us:gov:ic:ism:v2">
>   <ddms:identifier ddms:qualifier="URL"
>       ddms:value="https://metadata.dod.mil/mdr/ns/TBD/1.0/SampleTaxonomy.owl"/>
>   <ddms:identifier
>       ddms:qualifier="https://metadata.dod.mil/mdr/ns/MDR/1.0/MDR.owl#GovernanceNamespace"
>       ddms:value="TBD"/>
>   <ddms:identifier ddms:qualifier="Version" ddms:value="1.0"/>
>   <ddms:title ICISM:ownerProducer="USA"
>       ICISM:classification="U">Sample Taxonomy</ddms:title>
>   <ddms:description ICISM:ownerProducer="USA"
>       ICISM:classification="U">
>     This is a sample taxonomy created for the Help page.
>   </ddms:description>
>   <ddms:dates ddms:posted="2007-11-24"/>
>   <ddms:creator ICISM:ownerProducer="USA"
>       ICISM:classification="U">
>     <ddms:Person>
>       <ddms:name>Sample</ddms:name>
>       <ddms:surname>Developer</ddms:surname>
>       <ddms:affiliation>FGM, Inc.</ddms:affiliation>
>       <ddms:phone>703-885-1000</ddms:phone>
>       <ddms:email>sampledevelo...@fgm.com</ddms:email>
>     </ddms:Person>
>   </ddms:creator>
>   <ddms:security ICISM:ownerProducer="USA"
>       ICISM:classification="U" ICISM:nonICmarkings="DIST_STMT_A" />
>   <!-- Other DDMS elements may appear here. -->
> </ddms:Resource>
>
> You can see the DDMS site at https://metadata.dod.mil/.
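
For the DDMS case specifically, here is a rough sketch of writing each ddms:Resource node to its own file. It is only an illustration under some assumptions: the function name split_resources, the wrapping <Resources> root element, the sample data, and the resource_NNN.xml file names are all made up for the example, and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without installing anything extra.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Namespace URIs taken from the resource node quoted above.
DDMS_NS = "https://metadata.dod.mil/mdr/ns/DDMS/1.4/"
ICISM_NS = "urn:us:gov:ic:ism:v2"

# Keep the familiar prefixes on output instead of generated ns0/ns1.
ET.register_namespace("ddms", DDMS_NS)
ET.register_namespace("ICISM", ICISM_NS)

# Hypothetical input: a single root element (the name "Resources" is an
# assumption) wrapping several ddms:Resource nodes.
SAMPLE = """<Resources xmlns:ddms="https://metadata.dod.mil/mdr/ns/DDMS/1.4/"
           xmlns:ICISM="urn:us:gov:ic:ism:v2">
  <ddms:Resource>
    <ddms:title ICISM:classification="U">Sample Taxonomy</ddms:title>
  </ddms:Resource>
  <ddms:Resource>
    <ddms:title ICISM:classification="U">Second Sample</ddms:title>
  </ddms:Resource>
</Resources>"""


def split_resources(xml_text, out_dir):
    """Write every ddms:Resource element in xml_text to its own file.

    Returns the list of file paths written, in document order.
    """
    root = ET.fromstring(xml_text)
    paths = []
    # ElementTree spells a namespaced tag as "{namespace-uri}localname".
    for index, resource in enumerate(root.iter("{%s}Resource" % DDMS_NS), start=1):
        resource.tail = None  # drop whitespace that followed the node
        path = os.path.join(out_dir, "resource_%03d.xml" % index)
        ET.ElementTree(resource).write(path, encoding="utf-8", xml_declaration=True)
        paths.append(path)
    return paths


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    for written in split_resources(SAMPLE, out_dir):
        print(written)
```

With lxml the loop would look much the same, selecting the nodes with doc.xpath('//ddms:Resource', namespaces={'ddms': DDMS_NS}). Note that this only works if the input is well-formed XML with a single root; several <document> elements back to back, as in the first message, would need to be wrapped in a root element before parsing.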

--
OpenGroupware developer: awill...@whitemice.org
<http://whitemiceconsulting.blogspot.com/>
OpenGroupware & Cyrus IMAPd documentation @
<http://docs.opengroupware.org/Members/whitemice/wmogag/file_view>