I have been using NekoHTML instead of JTidy for about six months
to parse HTML and build a DOM from HTML input.
It lets you set up a chain of filters that are applied to the
document after parsing. One of these filters is "ElementRemover",
which either removes the specified elements from the document
or keeps them.
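
For illustration, here is a rough sketch of how such a filter chain might be
set up with the DOM parser. It assumes the NekoHTML and Xerces jars are on the
classpath; the class name NekoFilterSketch, the method parseAndFilter and the
choice of accepted elements are made up for this example. As far as I know,
elements that are neither accepted nor removed are stripped while their text
content is kept, so "html" and "body" are accepted here to keep a root element
for the DOM.

```java
import java.io.StringReader;

import org.apache.xerces.xni.parser.XMLDocumentFilter;
import org.cyberneko.html.filters.ElementRemover;
import org.cyberneko.html.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of NekoHTML itself.
public class NekoFilterSketch {

    // Parses the HTML string and returns a DOM with the filter applied.
    public static Document parseAndFilter(String html) throws Exception {
        ElementRemover remover = new ElementRemover();

        // Elements to keep (second argument: attributes to keep, or null).
        remover.acceptElement("html", null);
        remover.acceptElement("body", null);
        remover.acceptElement("p", null);
        remover.acceptElement("b", null);
        remover.acceptElement("a", new String[] { "href" });

        // Elements to drop together with their content.
        remover.removeElement("script");

        // The filter chain is passed to the parser as a property.
        XMLDocumentFilter[] filters = { remover };
        DOMParser parser = new DOMParser();
        parser.setProperty("http://cyberneko.org/html/properties/filters", filters);

        parser.parse(new InputSource(new StringReader(html)));
        return parser.getDocument();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseAndFilter(
            "<p>Hello <b>world</b><script>alert('x')</script></p>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

More filters can be added to the array; they run in the order listed.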

>> I don't know if it would really help, but you might try using CyberNeko 
>> [1] instead of JTidy. I've found it gives better results on average, 
>> particularly when dealing with [so-called] HTML pasted from Word.
>> 
>>     Ugo
>> 
>> 
>> [1] http://www.apache.org/~andyc/neko/doc/html/

MO> I must admit, that CyberNeko looks interesting :-)

MO> Regards,
MO> Marcin Okraszewski

MO> ---------------------------------------------------------------------
MO> To unsubscribe, e-mail: [EMAIL PROTECTED]
MO> For additional commands, e-mail: [EMAIL PROTECTED]

-- 
Best regards,
Peter Velychko                            
[EMAIL PROTECTED]


