Google is your friend. ElementTree is one of the better-documented options, IMHO, but there are many modules that can do this.
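
For what it's worth, here is a rough sketch of the split-then-parse idea from the quoted message, written for Python 3. The function names (split_documents, write_documents) and the output file naming are my own inventions for illustration. It assumes the resource has already been fetched and decoded to a str, that nothing but whitespace comes before the first <document> tag, and that the tag carries no attributes and never appears nested inside a document.

```python
import xml.etree.ElementTree as ET


def split_documents(data, tag="document"):
    """Split concatenated <tag>...</tag> fragments into parsed elements."""
    opening = "<%s>" % tag
    elements = []
    for part in data.split(opening):
        if not part.strip():
            continue  # skip the empty piece before the first opening tag
        # split() removed the opening tag, so put it back before parsing
        elements.append(ET.fromstring(opening + part))
    return elements


def write_documents(elements, prefix="document"):
    """Write each element to its own file (hypothetical naming: prefix_N.xml)."""
    names = []
    for index, element in enumerate(elements, start=1):
        name = "%s_%d.xml" % (prefix, index)
        ET.ElementTree(element).write(name, encoding="utf-8", xml_declaration=True)
        names.append(name)
    return names


# Made-up sample standing in for the real web resource
sample = "<document><a>1</a></document>\n<document><a>2</a></document>\n"
docs = split_documents(sample)
print([d.tag for d in docs])
```

Calling write_documents(docs) would then produce one file per document in the current directory. If the real data does not match the assumptions above (for example, an XML declaration in front of the first tag), the plain string split will not be enough and the input needs cleaning up first.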

> -----Original Message-----
> From: python-list-bounces+frsells=adventistcare....@python.org
> [mailto:python-list-bounces+frsells=adventistcare....@python.org] On
> Behalf Of Stefan Behnel
> Sent: Friday, January 29, 2010 2:25 PM
> To: python-list@python.org
> Subject: Re: Processing XML File
>
> jakecjacobson, 29.01.2010 18:25:
> > I need to take an XML web resource and split it up into smaller XML
> > files. I am able to retrieve the web resource but I can't find any
> > good XML examples. I am just learning Python so forgive me if this
> > question has been answered many times in the past.
> >
> > My resource is like:
> >
> > <document>
> > ...
> > ...
> > </document>
> > <document>
> > ...
> > ...
> > </document>
>
> Is this what you get as a document or is this just /contained/ in the
> document?
>
> Note that XML does not allow more than one root element, so the above is
> not XML. Each of the two <document>...</document> parts forms an XML
> document by itself, though.
>
> > So in this example, I would need to output 2 files, with the contents
> > of each file being what is between the open and close document tags.
>
> Are the two files formatted as you show above? In that case, you can
> simply iterate over the lines and cut the document when you see
> "<document>". Or, if you are sure that "<document>" only appears as a
> top-most element and not inside of the documents, you can search for
> "<document>" in the content (a string, I guess) and split it there.
>
> As was pointed out before, once you have these two documents, use the
> xml.etree package to work with them.
>
> Something like this might work:
>
>     import urllib2
>     import xml.etree.ElementTree as ET
>
>     data = urllib2.urlopen(url).read()
>
>     for part in data.split('<document>'):
>         if not part.strip():
>             continue  # skip the empty piece before the first tag
>         document = ET.fromstring('<document>' + part)
>         print(document.tag)
>         # ... do other stuff
>
> Stefan
> --
> http://mail.python.org/mailman/listinfo/python-list