I have been using NekoHTML instead of JTidy for about six months to parse HTML and build a DOM from the input. It lets you set up a chain of filters that are applied to the document after parsing. One of them is the "ElementRemover" filter, which either keeps or removes the elements you specify.
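To show what I mean by keeping or removing elements, here is a rough sketch. It does not use the NekoHTML API: it is a plain-JDK imitation of the idea, run on an already-built DOM. The class and method names (ElementRemoverSketch, filter, clean) are made up for this example, and the input has to be well-formed because the JDK XML parser stands in for the HTML parser. As far as I understand the real filter, elements that are neither accepted nor removed lose their tags but keep their content, so the sketch does the same.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; this is NOT NekoHTML's ElementRemover.
class ElementRemoverSketch {

    // Walks the children of 'parent' and applies three rules:
    //  - name in 'removed':  drop the element together with its content
    //  - name in 'accepted': keep the element as it is
    //  - anything else:      strip the tag but keep its children
    static void filter(Node parent, Set<String> accepted, Set<String> removed) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before modifying the tree
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                String name = child.getNodeName().toLowerCase();
                if (removed.contains(name)) {
                    parent.removeChild(child);
                } else {
                    filter(child, accepted, removed); // clean the subtree first
                    if (!accepted.contains(name)) {
                        // Move the already-filtered children up, then drop the tag.
                        while (child.getFirstChild() != null) {
                            parent.insertBefore(child.getFirstChild(), child);
                        }
                        parent.removeChild(child);
                    }
                }
            }
            child = next;
        }
    }

    // Parses well-formed markup and filters everything below the root element.
    static Document clean(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            filter(doc.getDocumentElement(), Set.of("b", "i"), Set.of("script"));
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        Document doc = clean("<div><b>bold</b><font color=\"red\">plain <i>it</i></font>"
                + "<script>alert(1)</script></div>");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

With NekoHTML itself you would not walk the DOM by hand. As far as I recall, you configure an org.cyberneko.html.filters.ElementRemover with acceptElement(...) and removeElement(...) and hand it to the parser through the "http://cyberneko.org/html/properties/filters" property, but please check the NekoHTML filter documentation for the exact usage.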
>> I don't know if it would really help, but you might try using CyberNeko
>> [1] instead of JTidy. I've found it gives better results on average,
>> particularly when dealing with [so-called] HTML pasted from Word.
>>
>> Ugo
>>
>> [1] http://www.apache.org/~andyc/neko/doc/html/

MO> I must admit, that CyberNeko looks interesting :-)

MO> Regards,
MO> Marcin Okraszewski

--
Best regards,
Peter Velychko
[EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]