> I've tried jTidy, but it seems to choke if the HTML document it receives is
> not well-formed...
Matthew:

jTidy will handle ill-formed documents... JournURL uses it pretty much constantly, anywhere HTML is involved. The problem is puzzling out which of the gazillion methods you need to call to get the results you're after.

Suggestions:

    jTidy.setNumEntities(true);
    jTidy.setXHTML(true);
    jTidy.setXmlOut(true);
    jTidy.setForceOutput(true);

That last one is crucial, and caused me weeks of headaches before I finally figured out what was happening.

--
Roger Benningfield
JournURL
http://admin.support.journurl.com/
http://admin.mxblogspace.journurl.com/
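
For context, here is a minimal Java sketch of how the four suggested setters might be applied end to end. It assumes the JTidy library (class org.w3c.tidy.Tidy) is on the classpath. The class name TidyExample, the method tidyToXhtml, and the sample input are hypothetical, and the setQuiet/setShowWarnings calls are additions that the post does not mention.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

import org.w3c.tidy.Tidy;

// Hypothetical helper for illustration; not part of the original post.
final class TidyExample {

    // Runs ill-formed HTML through JTidy and returns the cleaned-up markup.
    static String tidyToXhtml(String html) {
        Tidy jTidy = new Tidy();

        // The four settings suggested in the post.
        jTidy.setNumEntities(true);  // write numeric rather than named entities
        jTidy.setXHTML(true);        // emit XHTML
        jTidy.setXmlOut(true);       // emit well-formed XML
        jTidy.setForceOutput(true);  // still produce output when errors are found

        // Assumption (not in the post): keep diagnostics out of the way.
        jTidy.setQuiet(true);
        jTidy.setShowWarnings(false);

        // The sample input is ASCII-only; for other input, JTidy's input and
        // output encodings would need to be configured as well.
        ByteArrayInputStream in =
                new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        jTidy.parse(in, out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unclosed <b> and <p>, no <html> or <body>: deliberately ill-formed.
        System.out.println(tidyToXhtml("<p>Hello <b>world"));
    }
}
```

If the output comes back empty for a badly broken page, the setForceOutput(true) call is the first thing to check, since that option is what asks JTidy to write a result even after it has reported errors.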