Re: parsing 'html' documents using DOMParser

Andy Clark Mon, 27 Oct 2003 14:39:11 -0800

Mushfiqur Rahman wrote:

I want to parse an HTML document (which may not be an XHTML document) using org.apache.xerces.parsers.DOMParser and get an org.w3c.dom.Document after parsing. Can anyone tell me how I can do it?


If you just need a DOM document, there are a
few options. Check out JTidy[1] and NekoHTML[2].
JTidy has been around longer, but NekoHTML has
the advantage of using less memory, and it is
built on top of Xerces.

But, as with everything, check out all of your
options and pick the one that works for you.
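
To see why a plain XML parser is not enough for this, here is a
minimal sketch using only the JDK's built-in JAXP parser (which, in
current JDKs, is derived from Xerces). The class name, the helper
method tryParse, and the sample markup are illustrative assumptions,
not part of Xerces, JTidy or NekoHTML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StrictParseDemo {

    // Hypothetical helper: returns the parsed DOM, or null when the
    // markup is rejected (for example, because it is not well-formed XML).
    public static Document tryParse(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            // A strict XML parser reports a fatal error on tag soup.
            return null;
        }
    }

    public static void main(String[] args) {
        // Well-formed XHTML-style markup: every element is closed.
        String xhtml = "<html><body><p>Hello<br/></p></body></html>";
        // Ordinary HTML: <br> and <p> are left unclosed.
        String html = "<html><body><p>Hello<br></body></html>";

        System.out.println("XHTML parsed: " + (tryParse(xhtml) != null));
        System.out.println("HTML parsed:  " + (tryParse(html) != null));
    }
}
```

The first string should come back as an org.w3c.dom.Document, while
the second should be rejected (the default error handler may also
print a fatal-error line to stderr). Input like the second string is
the case where an HTML-aware parser such as JTidy or NekoHTML is
needed in front of, or instead of, the plain XML parser.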

[1] http://sourceforge.net/projects/jtidy/
[2] http://www.apache.org/~andyc/neko/doc/html/

--
Andy Clark * [EMAIL PROTECTED]


