Hi Dhananjay,

My requirement is simple: I need to extract information from a page. The pages may contain malformed or outright junk HTML, so the parser has to be tolerant of that.
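To make the requirement concrete, here is a rough sketch of the kind of extraction I mean. It uses only the standard library's HTMLParser (the `html.parser` module in current Python 3), which keeps going over unclosed or mismatched tags instead of raising an error. The `LinkCollector` class, the `extract` helper, and the sample markup are made up for illustration; this is not a recommendation over lxml or sgmlop, just a minimal example of tolerant parsing.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects link targets and visible text.

    HTMLParser does not validate nesting, so unclosed or stray tags
    are simply reported as they appear in the input.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []        # href values of <a> tags, in document order
        self.text_chunks = []  # non-empty text fragments, whitespace-trimmed

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_chunks.append(stripped)


def extract(html):
    """Parse possibly malformed HTML and return (links, text_chunks)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links, parser.text_chunks


# Deliberately broken markup: unclosed <b>, <p> and <a>, a stray </div>,
# and no closing </html>.
junk = (
    "<html><body><p>Hello <b>world<p>"
    "<a href='http://example.com/a'>first"
    "<a href=\"/b\">second</div></body>"
)

links, text = extract(junk)
print(links)
print(text)
```

The parser here only reports tags and text as it meets them; it does not repair the tree. If I need a proper document tree built from broken markup, I assume a library such as lxml would be the better fit.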
Thanks,
Puneet

On Thu, Sep 10, 2009 at 7:33 PM, Dhananjay Nene <dhananjay.n...@gmail.com> wrote:

> Do you require tolerance for non-well-formed XML / HTML? If yes, you may
> consider sgmlop: http://effbot.org/zone/sgmlop-index.htm
>
> On Thu, Sep 10, 2009 at 7:07 PM, Baishampayan Ghose <b.gh...@gmail.com> wrote:
>
>> > Can anyone suggest a good library for HTML parsing in Python?
>> > I googled and found a few libraries: BeautifulSoup, HTMLParser,
>> > SGMLParser, etc.
>> >
>> > Can anyone suggest which one I should go for, from your experience?
>>
>> BeautifulSoup was OK, but now it's broken. Use lxml, it's very good.
>>
>> http://codespeak.net/lxml/
>>
>> Regards,
>> BG
>>
>> --
>> Baishampayan Ghose
>> b.ghose at gmail.com
>
> --
> --------------------------------------------------------
> blog: http://blog.dhananjaynene.com
> twitter: http://twitter.com/dnene
_______________________________________________
BangPypers mailing list
BangPypers@python.org
http://mail.python.org/mailman/listinfo/bangpypers