Hi, I have a route like the following:
    from(sjms2)
        .unmarshal().jaxb("myjaxbpackage")

When I send an XML message with content like the following to the sjms2 endpoint:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    ... Rest of XML content here ...

any Danish characters (e.g. ø) in the message get mangled. The Camel message body in this example is a String.

Looking at the unmarshal implementation, it appears that Camel forces message bodies to InputStream (seemingly with UTF-8 encoding by default) before passing them to the JAXB data format. See https://github.com/apache/camel/blob/3312243b32af03ac39c3af170e318f03e01d64f0/core/camel-support/src/main/java/org/apache/camel/support/processor/UnmarshalProcessor.java#L56

I can work around this by converting the message body to a Latin-1 InputStream before unmarshalling, or by setting the encoding property on the data format, but I'm wondering: why is Camel implemented this way? For JAXB unmarshalling at least, there is no reason to serialize a String to an InputStream before handing it to JAXB, and doing so is less flexible than passing the String to JAXB directly: my code now has to decide which charset the input message uses, something JAXB would otherwise handle for me.

In the current code the serialization looks necessary because DataFormat.unmarshal takes an Exchange and an InputStream. Wouldn't it be more flexible to pass only the Exchange to the DataFormat, leaving each implementation free to check whether the message body is already in a form it can process before serializing it to bytes? For instance, the JAXB data format could check whether the body is a Reader or a String and use the matching JAXB Unmarshaller methods. For concreteness, I've put a few rough sketches at the end of this message: how I read the current behaviour, the workarounds, and the change I'm proposing.
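For context, this is roughly what I understand UnmarshalProcessor to do today (paraphrased from memory, so please check the linked source for the exact code):

    // The body is forced to an InputStream before the data format ever sees it;
    // a String body gets encoded with the exchange/default charset at this point.
    InputStream stream = exchange.getIn().getMandatoryBody(InputStream.class);
    Object result = dataFormat.unmarshal(exchange, stream);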
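Here is a rough sketch of the two workarounds I mentioned. Endpoint URIs and class names are placeholders, and it is untested as written:

    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;

    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.converter.jaxb.JaxbDataFormat;

    public class WorkaroundRoute extends RouteBuilder {
        @Override
        public void configure() {
            // Workaround 1: encode the String body to Latin-1 ourselves, so the bytes
            // handed to JAXB match the encoding declared in the XML prolog.
            from("sjms2:queue:inbound")
                .process(exchange -> {
                    String body = exchange.getMessage().getBody(String.class);
                    exchange.getMessage().setBody(
                        new ByteArrayInputStream(body.getBytes(StandardCharsets.ISO_8859_1)));
                })
                .unmarshal().jaxb("myjaxbpackage");

            // Workaround 2: set the encoding property on the data format instead.
            JaxbDataFormat jaxb = new JaxbDataFormat("myjaxbpackage");
            jaxb.setEncoding("ISO-8859-1");
            from("sjms2:queue:inbound-2")
                .unmarshal(jaxb);
        }
    }

Either way, my route has to hard-code knowledge of the producer's charset, which is the part I'd like to avoid.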
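And here is a sketch of what I have in mind for the JAXB data format, assuming a hypothetical DataFormat contract that receives the Exchange rather than an InputStream (this is not the current API, just an illustration of the idea):

    import java.io.InputStream;
    import java.io.Reader;
    import java.io.StringReader;

    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.JAXBException;
    import javax.xml.bind.Unmarshaller;

    import org.apache.camel.Exchange;
    import org.apache.camel.InvalidPayloadException;

    // Hypothetical sketch: how the JAXB data format could pick the JAXB input type
    // that matches the body if DataFormat.unmarshal received just the Exchange.
    public final class JaxbUnmarshalSketch {

        public static Object unmarshal(Exchange exchange, JAXBContext context)
                throws JAXBException, InvalidPayloadException {
            Unmarshaller unmarshaller = context.createUnmarshaller();
            Object body = exchange.getMessage().getBody();

            if (body instanceof Reader) {
                return unmarshaller.unmarshal((Reader) body);
            }
            if (body instanceof String) {
                // The String is already decoded characters, so no charset decision is needed.
                return unmarshaller.unmarshal(new StringReader((String) body));
            }
            // Fall back to today's behaviour: an InputStream, from which JAXB can
            // detect the encoding via the XML declaration.
            return unmarshaller.unmarshal(exchange.getMessage().getMandatoryBody(InputStream.class));
        }
    }

With something like this, the charset question only arises in the byte/stream case, where JAXB already knows how to resolve it from the XML declaration.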