Hi Dhananjay,

My requirement is simple: I need to extract information from a page, but the
pages can be malformed HTML or outright junk HTML. So the tolerance is required.
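
To make the requirement concrete, here is a rough sketch of the kind of
extraction I mean. It is only an illustration, not a recommendation over the
libraries suggested in this thread: it assumes Python 3 and uses the standard
library's html.parser (the successor of the HTMLParser module mentioned
below). The class name LinkAndTextExtractor, the helper extract() and the
sample markup are made up for this example.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collect href values and non-empty text chunks from possibly broken HTML.

    Hypothetical helper for illustration; html.parser does not build a tree,
    it just reports tags and text as it meets them, so unclosed or badly
    nested tags do not stop the extraction.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def extract(html_source):
    """Return (links, text_parts) found in html_source."""
    parser = LinkAndTextExtractor()
    parser.feed(html_source)
    parser.close()  # flush any text still buffered at end of input
    return parser.links, parser.text_parts


# Sample junk markup: unclosed <p>, <b> and <a> tags, an unquoted attribute,
# and no closing </html>.
junk = (
    "<html><body><p>Hello <b>world<p>"
    "<a href='http://example.com/a'>first<a href=/b>second</body>"
)
links, text = extract(junk)
print(links)
print(text)
```

This only pulls out links and text; anything that needs the document
structure (tables, nested sections) would need a tree-building parser instead.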


On Thu, Sep 10, 2009 at 7:33 PM, Dhananjay Nene <dhananjay.n...@gmail.com> wrote:

> Do you require tolerance for non-well-formed XML / HTML? If yes, you may
> consider sgmlop: http://effbot.org/zone/sgmlop-index.htm
> On Thu, Sep 10, 2009 at 7:07 PM, Baishampayan Ghose <b.gh...@gmail.com> wrote:
>> > Can anyone suggest a good library for HTML parsing in Python?
>> > I googled and found a few libraries: BeautifulSoup, HTMLParser, SGMLParser,
>> > etc.
>> >
>> > Can anyone suggest which one I should go for, from your experience?
>> BeautifulSoup was OK, but now it's broken. Use lxml, it's very good.
>> http://codespeak.net/lxml/
>> Regards,
>> BG
>> --
>> Baishampayan Ghose
>> b.ghose at gmail.com
>> _______________________________________________
>> BangPypers mailing list
>> BangPypers@python.org
>> http://mail.python.org/mailman/listinfo/bangpypers
> --
> --------------------------------------------------------
> blog: http://blog.dhananjaynene.com
> twitter: http://twitter.com/dnene