I'm using JTidy to convert a string containing some HTML to XHTML in a DOM
tree. I can't get foreign (non-ASCII) characters converted to their XHTML
counterparts. Which setting do I need to use?
Here's a code snippet from my XSP page:
// Assumes java.io.ByteArrayInputStream and org.w3c.tidy.Tidy are imported
// in the page's <xsp:structure>.
String strContent = request.getParameter("content");
// Note: getBytes() with no argument uses the platform default encoding.
ByteArrayInputStream in = new ByteArrayInputStream(strContent.getBytes());
String strOut = "";
org.w3c.dom.Document doc = null;
try {
    Tidy tidy = new Tidy();
    // create output as XML
    tidy.setXmlOut(true);
    // output should be XHTML conforming
    tidy.setXHTML(true);
    tidy.setBreakBeforeBR(false);
    tidy.setRawOut(false);
    // UTF8 is a static constant, so no Configuration instance is needed
    tidy.setCharEncoding(org.w3c.tidy.Configuration.UTF8);
    // output non-breaking spaces as entities
    tidy.setQuoteNbsp(true);
    // output naked ampersands as &amp;
    tidy.setQuoteAmpersand(true);
    // pass whitespace in attribute values through unchanged
    tidy.setLiteralAttribs(true);
    // parse the stream to a DOM document
    doc = tidy.parseDOM(in, null);
} catch (Exception e) {
    // report the failure instead of silently swallowing it
    e.printStackTrace();
}
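One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis: strContent.getBytes() encodes with the platform default charset, while Tidy is told to read UTF-8. If the two differ, non-ASCII characters can be damaged before Tidy ever sees them. The sketch below uses only the JDK (not JTidy) to show the effect; the class name EncodingCheck and the helper roundTrip are hypothetical names chosen for illustration, and ISO-8859-1 merely stands in for whatever the platform default happens to be.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Encode text with one charset and decode the bytes with another,
    // mimicking getBytes() followed by a reader that assumes a fixed encoding.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, written as an escape to keep the source ASCII
        String sample = "caf\u00e9";

        // Same charset on both sides: the text survives unchanged.
        String matched = roundTrip(sample, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("matched charsets intact:    " + sample.equals(matched));

        // ISO-8859-1 bytes read as UTF-8: the accented character is lost.
        String mismatched = roundTrip(sample, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("mismatched charsets intact: " + sample.equals(mismatched));
    }
}
```

If this turns out to be the cause, passing an explicit encoding such as strContent.getBytes("UTF-8") when building the input stream would keep both sides in agreement. Whether the request parameter itself was decoded correctly by the servlet container is a separate question this sketch does not cover.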
Bert