Hi!
I am trying to understand how to create and parse UTF-8 encoded XML documents
using Xerces, but so far I have failed. For example, the following piece of
code throws an exception in the parse method because of an illegal character.
To me it looks like the serializer isn't using UTF-8 even though it says so.
I have tried telling the serializer to use UTF-8, with the same result. I have
also tried setting the encoding of the OutputFormat to UTF-8, again with the
same result. How is it supposed to work?

package dom;

import org.w3c.dom.*;
import org.apache.xerces.dom.*;
import org.apache.xml.serialize.*;
import java.io.*;
import javax.xml.parsers.*;

public class TstEnc
{
    public static void main( String[] argv ) {
        try
        {
            // Build a document whose root element has a non-ASCII name.
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("Åke");
            doc.appendChild( root );

            // Serialize the document into a string.
            OutputFormat format = new OutputFormat( doc );
            StringWriter stringOut = new StringWriter();
            XMLSerializer serial = new XMLSerializer( stringOut, format );
            serial.asDOMSerializer();
            serial.serialize( doc.getDocumentElement() );

            // Write the string to a file.
            FileOutputStream file1 = new FileOutputStream("doc.xml");
            file1.write(stringOut.toString().getBytes());
            file1.close();

            // Parse the file back in; this is the call that fails.
            File file2 = new File("doc.xml");
            doc = builder.parse(file2);
        } catch ( Exception ex ) {
            System.out.println("Error: " + ex.getMessage());
        }
    }
}
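
One possible explanation, offered as a sketch rather than a confirmed diagnosis: `String.getBytes()` with no argument encodes with the platform default charset. If that default is something like ISO-8859-1 (an assumption about the machine involved), the bytes in doc.xml would not match the UTF-8 encoding that the parser expects. The JDK-only example below illustrates the mismatch without Xerces-specific classes. The class name `EncodingDemo` and the helper `parseRootName` are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of the original program.
class EncodingDemo {
    // The declaration announces UTF-8. \u00C5 is the letter A with ring above,
    // written as an escape so the encoding of this source file does not matter.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><\u00C5ke/>";

    /** Returns the root element name, or null if the bytes cannot be parsed. */
    static String parseRootName(byte[] bytes) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception ex) {
            // A malformed byte sequence surfaces here as a parse or I/O error.
            return null;
        }
    }

    public static void main(String[] args) {
        // Bytes that match the declared encoding.
        byte[] utf8 = XML.getBytes(StandardCharsets.UTF_8);
        // Bytes as a Latin-1 platform default would produce them (assumed case).
        byte[] latin1 = XML.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes parsed:   " + (parseRootName(utf8) != null));
        System.out.println("Latin-1 bytes parsed: " + (parseRootName(latin1) != null));
    }
}
```

If this is indeed the cause, the corresponding change in the program above would be to write the file with `getBytes("UTF-8")`, or to hand the `FileOutputStream` directly to the `XMLSerializer(OutputStream, OutputFormat)` constructor so that the serializer does the byte encoding itself. The parser's default error handler may also print a diagnostic to standard error for the failing case.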