I don't know if it would really help, but you might try using CyberNeko [1] instead of JTidy. I've found it gives better results on average, particularly when dealing with [so-called] HTML pasted from Word.
Ugo
[1] http://www.apache.org/~andyc/neko/doc/html/
I must admit that CyberNeko looks interesting :-)
Regards, Marcin Okraszewski