[EMAIL PROTECTED] wrote:
>
> Hiya,
>
> I am trying to parse some very ugly HTML, complete with missing closing
> tags and unquoted attributes.
>
> I have noticed a few HTML packages in Xerces, but I can't make sense of
> them.
>
> Is there a class that I could use to parse HTML and build a proper DOM out
> of it?
Sounds like you want an HTML parser which converts to XHTML. Try Tidy
at http://www.w3.org/People/Raggett/tidy/
-Edwin
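
Once Tidy has turned the HTML into well-formed XHTML, any XML parser
should be able to build the DOM. The following is only a minimal sketch
of that second step, using the standard JAXP API. The class name
XhtmlToDom and the SAMPLE string are hypothetical: SAMPLE stands in for
whatever Tidy would emit, and it is not real Tidy output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: parses already-tidied XHTML into a W3C DOM.
final class XhtmlToDom {

    // Assumed stand-in for Tidy output: closed tags, quoted attributes.
    // Real Tidy output usually carries a DOCTYPE as well, which may need
    // an EntityResolver so the parser does not try to fetch the DTD.
    static final String SAMPLE =
        "<html><head><title>Test</title></head>"
        + "<body><p class=\"intro\">Hello</p><p>World</p></body></html>";

    // Builds a DOM Document from a string of well-formed XHTML.
    static Document parse(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        // Walk every <p> element and print its text content.
        NodeList paragraphs = doc.getElementsByTagName("p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            System.out.println(paragraphs.item(i).getTextContent());
        }
    }
}
```

Feeding the original untidied HTML to this parser would most likely fail
with a SAXParseException, which is why the Tidy pass comes first.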