On Friday, 23 March 2012 13:52:05 UTC, Sangeet wrote:
> Hi,
>
> I've got to fetch data from the snippet below and have been trying to match
> the digits in this to specifically to specific groups. But I can't seem to
> figure how to go about stripping the tags! :(
>
> <tr><td align="center"><b>Sum</b></td><td></td><td align='center'
> class="green">245</td><td align='center' class="red">11</td><td
> align='center'>0</td><td align='center' >256</td><td align='center' >1.496
> [min]</td></tr>
> </table>
>
> Actually, I'm working on ROBOT Framework, and haven't been able to figure out
> how to read data from HTML tables. Reading from the source, is the best (read
> rudimentary) way I could come up with. Any suggestions are welcome!
>
> Thanks,
> Sangeet

I would personally use lxml. A quick example:

    # -*- coding: utf-8 -*-
    import lxml.html

    text = """
    <tr><td align="center"><b>Sum</b></td><td></td>
    <td align='center' class="green">245</td>
    <td align='center' class="red">11</td>
    <td align='center'>0</td><td align='center' >256</td>
    <td align='center' >1.496 [min]</td></tr>
    </table>
    """

    # Parse the HTML fragment into an element tree.
    table = lxml.html.fromstring(text)

    for tr in table.xpath('//tr'):
        # Pair each cell's class attribute ('' if absent) with its text;
        # text_content() drops nested tags such as <b>.
        print([
            (el.get('class', ''), el.text_content())
            for el in tr.iterfind('td')
        ])

which prints:

    [('', 'Sum'), ('', ''), ('green', '245'), ('red', '11'), ('', '0'), ('', '256'), ('', '1.496 [min]')]

It does a reasonable job. If it doesn't parse your real page quite right,
there's a .fromstring(parser=...) option, and you should be able to pass in
ElementSoup and try your luck from there.

hth,

Jon.
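
If installing lxml isn't an option, roughly the same (class, text) pairs can be collected with the standard library's html.parser. This is only a minimal sketch under that assumption, not the lxml approach above: the names CellCollector and extract_cells are made up for illustration, it assumes Python 3, and it only handles flat <td> cells (no nested tables).

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Hypothetical helper: collect (class, text) pairs for each <td>."""

    def __init__(self):
        super().__init__()
        self.cells = []        # finished (class, text) pairs
        self._current = None   # [class, text parts] while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # attrs is a list of (name, value) pairs; default class to ''.
            self._current = [dict(attrs).get("class", ""), []]

    def handle_data(self, data):
        # Keep text only while inside a cell; nested tags like <b> are
        # skipped, but their text still arrives here.
        if self._current is not None:
            self._current[1].append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._current is not None:
            css_class, parts = self._current
            self.cells.append((css_class, "".join(parts).strip()))
            self._current = None


def extract_cells(html):
    """Return a list of (class, text) tuples, one per <td> in html."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


if __name__ == "__main__":
    snippet = (
        '<tr><td align="center"><b>Sum</b></td><td></td>'
        '<td align=\'center\' class="green">245</td>'
        '<td align=\'center\' class="red">11</td>'
        "<td align='center'>0</td><td align='center' >256</td>"
        "<td align='center' >1.496 [min]</td></tr></table>"
    )
    print(extract_cells(snippet))
```

From there, picking out the numbers by group is a matter of filtering on the class, e.g. [int(text) for cls, text in extract_cells(snippet) if cls == "green"].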