XMLConverter and default charset

Arjan Moraal Thu, 06 Mar 2008 03:32:51 -0800

The org.apache.camel.converter.jaxp.XMLConverter class has a method that
converts a String to a DOM Document. This method is called automatically
when, for instance, an XPath expression is run on a TextMessage received
over JMS.


    @Converter
    public Document toDOMDocument(String text) throws IOException,
            SAXException, ParserConfigurationException {
        return toDOMDocument(text.getBytes());
    }

The problem is that the String is converted to a byte[] using the platform's
default character encoding (in my case CP-1252 on Windows XP). But the XML in
the text message might declare a different encoding in its header
(<?xml version="1.0" encoding="UTF-8"?>), so the parser decodes the bytes
with a charset other than the one that produced them. This can cause
SAXParser exceptions (such as: Invalid byte 1 of 1-byte UTF-8 sequence).
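
To make the mismatch concrete, here is a minimal reproduction sketch that uses only the JDK's JAXP parser, not Camel. The class name DefaultCharsetRepro and the helper parseAs are hypothetical names for illustration. ISO-8859-1 stands in for the platform default, since it encodes "é" as the same single byte (0xE9) that CP-1252 does.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class DefaultCharsetRepro {
    // Hypothetical helper: encodes the text with the named charset (standing in
    // for whatever text.getBytes() would pick as the platform default) and
    // hands the resulting bytes to the parser.
    static Document parseAs(String text, String charsetName) throws Exception {
        byte[] bytes = text.getBytes(charsetName);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";

        // Bytes agree with the declared encoding, so parsing succeeds.
        Document ok = parseAs(xml, "UTF-8");
        System.out.println("Parsed: " + ok.getDocumentElement().getTextContent());

        // Bytes are single-byte encoded but the declaration still says UTF-8,
        // so the parser hits an invalid UTF-8 sequence at the accented character.
        try {
            parseAs(xml, "ISO-8859-1");
        } catch (Exception e) {
            System.out.println("Parse failed: " + e.getMessage());
        }
    }
}
```

With pure ASCII content both calls succeed, which is probably why the problem only shows up once a message contains accented or other non-ASCII characters.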

So shouldn't this toDOMDocument() method either use the encoding declared in
the XML to convert the String to a byte[], or change the encoding attribute
in the XML header to match the character encoding used to generate the
byte[]?
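
As a rough sketch of the first option (not the actual Camel code), the converter could read the encoding out of the XML header and use it for the byte conversion. The class name DeclaredEncodingSketch, the helper declaredCharset and the regular expression are my own assumptions. The sketch falls back to UTF-8 when no encoding is declared.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class DeclaredEncodingSketch {
    // Matches encoding="..." or encoding='...' inside an XML declaration
    // that sits at the very start of the text.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");

    // Hypothetical helper: returns the declared charset, or UTF-8 if the
    // text has no encoding declaration. Charset.forName throws an unchecked
    // exception if the declared name is not supported by the JVM.
    static Charset declaredCharset(String text) {
        Matcher m = ENCODING.matcher(text);
        return m.find() ? Charset.forName(m.group(1)) : Charset.forName("UTF-8");
    }

    // Same shape as the converter method, but the bytes are produced with
    // the declared charset instead of the platform default.
    static Document toDOMDocument(String text) throws Exception {
        byte[] bytes = text.getBytes(declaredCharset(text));
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }
}
```

Another possibility would be to skip the byte[] step altogether and parse from characters, for example with new InputSource(new StringReader(text)), since the String is already decoded at that point.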

Thanks,
Arjan

-- 
View this message in context: 
http://www.nabble.com/XMLConverter-and-default-charset-tp15871372s22882p15871372.html
Sent from the Camel - Users mailing list archive at Nabble.com.
