Hi group, I have another question about parsing XML files. With Kent's and Danny's help, I found it very easy to parse XML files.
I realized that all my XML files contain '\t', '\n', and other extra whitespace. These extra characters make it very difficult to extract the text data from the XML files. I can make the XML parser work when I remove the '\n' and '\t' characters from the files. Is there an easy way to get rid of the '\n' and '\t' characters in XML files?

Thank you very much.
MDan

--- Kent Johnson <[EMAIL PROTECTED]> wrote:

> ps python wrote:
> > Kent and Dany,
> > Thanks for your replies.
> >
> > Here fromstring() assuming that the input is in a kind
> > of text format.
>
> Right, that is for the sake of a simple example.
>
> > what should be the case when I am reading files
> > directly.
> >
> > I am using the following :
> >
> > from elementtree.ElementTree import ElementTree
> > mydata = ElementTree(file='00001.xml')
> > iter = root.getiterator()
> >
> > Here the whole XML document is loaded as element tree
> > and how should this iter into a format where I can
> > apply findall() method.
>
> Call findall() directly on mydata, e.g.
> for process in mydata.findall('//biological_process'):
>     print process.text
>
> The path //biological_process means find any biological_process element
> at any depth from the root element.
>
> Kent

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
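
On the whitespace question above, one possible approach (a minimal sketch, not necessarily what the list would recommend) is to leave the files alone and clean up each element's text after parsing. The sketch assumes the standard-library xml.etree.ElementTree module, which descends from the elementtree package used in the thread; the SAMPLE document and the clean_text() helper are made up for illustration, and only the biological_process tag name comes from the messages above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a file such as 00001.xml;
# the \t and \n escapes become real tabs and newlines in the string.
SAMPLE = """<root>
\t<biological_process>
\t\tcell   adhesion
\t</biological_process>
\t<biological_process>signal
\ttransduction</biological_process>
</root>"""


def clean_text(element):
    """Return element.text with tabs, newlines and repeated spaces collapsed."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces with one space normalizes the text.
    return " ".join((element.text or "").split())


root = ET.fromstring(SAMPLE)
processes = [clean_text(p) for p in root.iter("biological_process")]
print(processes)
```

For a file on disk, ET.parse('00001.xml').getroot() would take the place of ET.fromstring(SAMPLE). If only leading and trailing whitespace matters, element.text.strip() may be enough.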