After some debugging, I found out the following:

- The standard encoding used to serialize is UTF-8.
- For the toString() method, everything is serialized to UTF-8 and then a string is created from those bytes without specifying an encoding, so the platform default is used, which in my case is probably iso-latin-1 (ISO-8859-1).
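The mismatch described above can be reproduced without Axiom at all. The following is only a minimal sketch, not Axiom's actual code: the class name PoundSignEncodingDemo and its two helper methods are made up for illustration, and it uses java.nio.charset.StandardCharsets, which needs Java 7 or later. It encodes the pound sign as UTF-8 and then decodes the same bytes once as ISO-8859-1 and once as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it only illustrates the encode/decode mismatch.
class PoundSignEncodingDemo {

    // Encode text as UTF-8, the encoding the serializer is reported to use.
    public static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes with an explicitly named charset instead of the platform default.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Render each char as a hex code point so the output does not depend on the console encoding.
    public static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // the pound sign
        byte[] utf8 = toUtf8(pound);

        System.out.println(Arrays.toString(utf8)); // [-62, -93]

        // Decoding UTF-8 bytes as ISO-8859-1 yields two characters.
        System.out.println(hex(decode(utf8, StandardCharsets.ISO_8859_1))); // U+00C2 U+00A3

        // Decoding with the matching charset restores the single pound sign.
        System.out.println(hex(decode(utf8, StandardCharsets.UTF_8))); // U+00A3
    }
}
```

If the bytes-to-string step named UTF-8 explicitly, as in the last call, the extra character would not appear, whatever the platform default is.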
In my particular case, the pound sign is serialized to the bytes -62,-93, which is 11000010 10100011 (0xc2 0xa3), the UTF-8 representation of the pound symbol (code point 00010100011, i.e. U+00A3). In iso-latin-1, however, this byte sequence represents a special A-character (0xc2, A with circumflex) followed by the pound sign (0xa3). That the second byte is the Latin-1 code of the pound sign looks coincidental, but it holds for every code point from U+0080 to U+00BF.

I'm using Axiom in combination with Axis2 and this is causing a problem. Can I do something about this? I don't know how Axis2 uses Axiom, but the weird characters show up in its output as well...

Brecht

-----Original Message-----
From: Brecht Yperman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 10 October 2006 13:35
To: [email protected]
Subject: RE: Weird character in XML string

The Java sample code got removed, I'll add it to the body.

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringBufferInputStream;
import java.io.StringWriter;
import java.io.Writer;

import javax.xml.stream.XMLStreamException;

import org.apache.axiom.om.OMElement;
import org.apache.axiom.om.impl.builder.StAXBuilder;
import org.apache.axiom.om.impl.builder.StAXOMBuilder;

public class TestReadXml {

    public static void main(String[] args) throws IOException, XMLStreamException {
        // Read the file as ISO-8859-1, so the pound sign (0xa3) is decoded correctly.
        String test = readFileToString(new File("bin/result.xml"), "ISO8859_1");
        System.out.println(test);

        // XMLStreamReader parser =
        //     XMLInputFactory.newInstance().createXMLStreamReader(ir);

        // StringBufferInputStream is deprecated; it keeps only the low
        // eight bits of each character.
        InputStream is = new StringBufferInputStream(test);
        StAXBuilder builder = new StAXOMBuilder(is);
        OMElement documentElement = builder.getDocumentElement();

        // This is where the extra character (0xc2) shows up.
        System.out.println(documentElement.toString());
    }

    public static String readFileToString(File file, String encoding) throws IOException {
        InputStream in = new java.io.FileInputStream(file);
        try {
            return toString(in, encoding);
        } finally {
            closeQuietly(in);
        }
    }

    public static String toString(InputStream input, String encoding) throws IOException {
        StringWriter sw = new StringWriter();
        copy(input, sw, encoding);
        return sw.toString();
    }

    public static void copy(InputStream input, Writer output, String encoding) throws IOException {
        InputStreamReader in = new InputStreamReader(input, encoding);
        copy(in, output);
    }

    // Copies all characters from input to output and returns the number copied.
    public static int copy(Reader input, Writer output) throws IOException {
        char[] buffer = new char[DEFAULT_BUFFER_SIZE];
        int count = 0;
        int n = 0;
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
            count += n;
        }
        return count;
    }

    private static final int DEFAULT_BUFFER_SIZE = 1024 * 4;

    public static void closeQuietly(InputStream input) {
        if (input == null) {
            return;
        }
        try {
            input.close();
        } catch (IOException ioe) {
            // ignored on purpose
        }
    }
}

-----Original Message-----
From: Brecht Yperman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 10 October 2006 13:33
To: [email protected]
Subject: Weird character in XML string

Hi,

I'm reading in an XML file with the pound sign in it (0xa3). When I parse it using the StAXOMBuilder and then print documentElement.toString() to the console, a weird character (0xc2) appears in front of the pound sign.

What is happening?

Thanks a lot,
Brecht

Invenso - The "Integration Software" specialists.
____________________________________________
Brecht Yperman
Development Team
Direct: +32 (0)3 780 30 05
Email: [EMAIL PROTECTED]

INVENSO bvba
Industriepark-West 75
9100 Sint-Niklaas
Belgium - Europe
Phone: +32 (0)3 780 30 02
Fax: +32 (0)3 780 30 03
Email: [EMAIL PROTECTED]
Website: www.invenso.com
VAT BE 0477.834.668
RPR Sint-Niklaas

"E-mail disclaimer: This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient, please note that any review, dissemination, disclosure, alteration, printing, copying or transmission of this e-mail and/or any file transmitted with it, is strictly prohibited and may be unlawful.
If you have received this e-mail by mistake, please immediately notify the sender and permanently delete the original as well as any copy of any e-mail and any printout thereof."

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
