On Apr 13, 2:12 pm, Chris Colbert <[email protected]> wrote:
> On Tue, Apr 13, 2010 at 1:58 PM, varnikat t <[email protected]> wrote:
>
> > Hi,
> > Can anyone tell me how to get text from a html file?I am trying to display
> > the text of an html file in textview(of glade).If i directly display the
> > file,it shows with html tags and attributes, etc. in textview.I don't want
> > that.I just want the text.
> > Can someone help me with this?
>
> > Regards
> > Varnika Tewari
>
> > --
> > http://mail.python.org/mailman/listinfo/python-list
>
> You should look into beautiful soup
>
> http://www.crummy.com/software/BeautifulSoup/

For more complex parsing, Beautiful Soup is definitely the way to go. However, if all you want to do is strip the HTML and keep the remaining text, I'd recommend the pyparsing package with this short script:

http://pyparsing.wikispaces.com/file/view/htmlStripper.py
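
If installing an extra package is not an option, the same "drop the tags, keep the text" idea can be sketched with the standard library alone. To be clear, this is not the pyparsing script linked above and not Beautiful Soup; it is a rough alternative built on html.parser, written for Python 3. The names TextExtractor and html_to_text are made up for this example, and the choice to skip script and style contents is my own assumption about what "just the text" should mean.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    # Assumption: the contents of these elements are not wanted as text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space, then collapse all runs of whitespace.
        # Side effect: a word split by inline tags ("<b>wor</b>ld") gains a space.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()
```

The resulting string should be suitable for the textview, for example via textview.get_buffer().set_text(html_to_text(data)). For messy real-world pages, Beautiful Soup will cope better than this sketch.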
