> I've tried jTidy, but it seems to choke if the HTML document it receives is 
> not well-formed...

Matthew: jTidy will handle ill-formed documents... JournURL uses it
pretty much constantly, anywhere HTML is involved. The problem is
puzzling out which of the gazillion methods you need to call to get
the results you're after.

Suggestions:

jTidy.setNumEntities(true);
jTidy.setXHTML(true);
jTidy.setXmlOut(true);
jTidy.setForceOutput(true);

That last one is crucial: without it, jTidy writes no output at all
when it decides the document contains errors. That caused me weeks of
headaches before I finally figured out what was happening.
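
For anyone wiring this up from plain Java, here is a minimal sketch of the
settings above in context. It assumes the jTidy object is an instance of
org.w3c.tidy.Tidy and that the JTidy jar is on the classpath. The class
name, the tidy() helper, and the sample markup are made up for
illustration, and the two calls that silence the report output are an
addition that is not in the list above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

import org.w3c.tidy.Tidy;

public class JTidySketch {

    // Hypothetical helper: runs markup through JTidy with the four settings above.
    static String tidy(String html) {
        Tidy jTidy = new Tidy();
        jTidy.setNumEntities(true);
        jTidy.setXHTML(true);
        jTidy.setXmlOut(true);
        jTidy.setForceOutput(true);   // still emit markup when errors are reported

        // Assumption, not from the post: keep the report off the console.
        jTidy.setQuiet(true);
        jTidy.setShowWarnings(false);

        // ASCII-only input here, so no encoding setting is touched.
        ByteArrayInputStream in =
                new ByteArrayInputStream(html.getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        jTidy.parse(in, out);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Sample input with unclosed tags.
        System.out.println(tidy("<p>unclosed <b>bold"));
    }
}
```

Going through byte streams rather than readers keeps the sketch on the
parse(InputStream, OutputStream) call, which has been in JTidy for a long
time. Real documents with non-ASCII characters would also need the
encoding configured, which is left out here.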

--
Roger Benningfield
JournURL
http://admin.support.journurl.com/
http://admin.mxblogspace.journurl.com/
