I've had fairly good experiences with JTidy!

But HTMLParser (http://htmlparser.sourceforge.net/) seems to have
the lighter-looking API. It is event-based, and I might need to
parse some large HTML documents sometime soon, where building a
full DOM tree in memory might become a problem. Does anyone
have practical experience with HTMLParser?
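
A minimal sketch of the event-based style, for illustration only. It
does not use HTMLParser's own API; it assumes the JDK's built-in Swing
HTML parser (javax.swing.text.html) as a stand-in, and the class name
EventTextExtractor is hypothetical. The point is that text arrives
through callbacks while the input is read, so no DOM tree is built:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: stand-in for an event-based parser, not HTMLParser itself.
public class EventTextExtractor {

    /** Collects text chunks as the parser reports them; no tree is kept. */
    public static String extractText(Reader html) throws IOException {
        final StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called once per run of character data found in the input.
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(data);
            }
        };
        // 'true' tells the parser to ignore any charset declared in the page.
        new ParserDelegator().parse(html, callback, true);
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>";
        System.out.println(extractText(new StringReader(html)));
    }
}
```

For indexing, the callback could write straight into whatever sink is
needed instead of a StringBuilder, so memory use would not have to grow
with the size of the page.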

Thanks
Frank

> -----Original Message-----
> From: petite_abeille [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, February 25, 2003 19:49
> To: Lucene Users List
> Subject: Re: Best HTML Parser !!
> 
> 
> 
> On Monday, Feb 24, 2003, at 20:28 Europe/Zurich, Lukas Zapletal wrote:
> 
> > I have had some good experiences with JTidy. It works like a DOM-XML
> > parser and cleans up the HTML along the way.
> 
> I use jtidy also. Both for parsing and clean-up. Works pretty nicely.
> 
> > This is VERY useful, because EVERY HTML document has at least ONE error.
> 
> This rule should be tattooed on every parser's head: out of the 
> laboratory, nothing is compliant. Which renders the race to "more 
> compliance" among the different parsers somewhat ridiculous.
> 
> Cheers,
> 
> PA.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
