We have a route that calls a SOAP web service. The response message contains UTF-8 encoded content, and for some reason this results in the following exception. What are we doing wrong?

2014-01-01 15:13:01,375 | INFO | ler-ura_Worker-1 | JobRunShell | 216 - org.apache.servicemix.bundles.quartz - 1.8.6.1 | Job DEFAULT.quartz-endpoint82 threw a JobExecutionException:
org.quartz.JobExecutionException: java.io.IOException: javax.xml.bind.UnmarshalException
 - with linked exception:
[com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3c (at char #408, byte #127)] [See nested exception: java.io.IOException: javax.xml.bind.UnmarshalException
 - with linked exception:
[com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3c (at char #408, byte #127)]]
    at org.apache.camel.component.quartz.QuartzEndpoint.onJobExecute(QuartzEndpoint.java:117)[218:org.apache.camel.camel-quartz:2.10.6]
    at org.apache.camel.component.quartz.CamelJob.execute(CamelJob.java:61)[218:org.apache.camel.camel-quartz:2.10.6]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:223)[216:org.apache.servicemix.bundles.quartz:1.8.6.1]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)[216:org.apache.servicemix.bundles.quartz:1.8.6.1]
Caused by: java.io.IOException: javax.xml.bind.UnmarshalException
 - with linked exception:
[com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0x3c (at char #408, byte #127)]
    at org.apache.camel.converter.jaxb.JaxbDataFormat.unmarshal(JaxbDataFormat.java:153)[222:org.apache.camel.camel-jaxb:2.10.6]
    at org.apache.camel.dataformat.soap.SoapJaxbDataFormat.unmarshal(SoapJaxbDataFormat.java:275)[241:org.apache.camel.camel-soap:2.10.6]
    at org.apache.camel.processor.UnmarshalProcessor.process(UnmarshalProcessor.java:57)[100:org.apache.camel.camel-core:2.10.6]
    at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61)[100:org.apache.camel.camel-core:2.10.6]

The relevant parts of our route look like this:

FooRoutes.java:

    from("direct:foo1").routeId("foo1").
        errorHandler(defaultErrorHandler().
            maximumRedeliveries(3).
            redeliveryDelay(100).
            retryAttemptedLogLevel(WARN)).
        to(uraService).
        log(INFO, "1: before unmarshal: ${body}").
        unmarshal(soapJaxbDataFormat).
        log(INFO, "2: ${body}");

foo-camel-context.xml:

    <bean id="soapJaxbDataFormat" class="org.apache.camel.model.dataformat.SoapJaxbDataFormat">
        <property name="contextPath" value="fi.ourdomain.xsd._1"/>
    </bean>

    <camelcxf:cxfEndpoint id="uraService" address="${ura.cxf}" serviceClass="fi.ourdomain._1_0.UraPort">
        <camelcxf:properties>
            <entry key="dataFormat" value="MESSAGE"/>
        </camelcxf:properties>
    </camelcxf:cxfEndpoint>

So we are calling the SOAP web service "uraService", for which we have the JAX-WS generated interface "UraPort". We then try to unmarshal the XML response into JAXB beans. This works fine when the content has no special characters. But when I set the content to contain, for example, "Ä" (U+00C4, UTF-8 bytes c3 84, LATIN CAPITAL LETTER A WITH DIAERESIS), it breaks with the exception above.

When I test the web service directly, the character seems to be fine. When I inspect the response from the web service with a hex editor, I see the character Ä represented by the bytes 0xc3 and 0x84, as I think it should be. "Ä" is the last letter of the element and is followed by "<" (byte 0x3c). The exception seems to hint that the unmarshalling thinks the character that starts with 0xc3 and 0x84 does not end there but continues with 0x3c, which is wrong. Or something like that...

The problematic part of the message:

    <HankkeenKuvaus>kuvaus jossa viimeinen merkki on skandiÄ</HankkeenKuvaus>

This is OK when I access the web service directly with SoapUI. When my route writes the first log message, the problematic character looks garbled in the log:

2014-01-01 15:13:01,062 | INFO | ault-workqueue-1 | foo1 | 100 - org.apache.camel.camel-core - 2.10.6 | !!!! ennen unmarshallia: <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><FooAResponse xmlns="http://ourdomain/xsd/1.0"><FooB><FooC>....<HankkeenKuvaus>kuvaus jossa viimeinen merkki on skandi�</HankkeenKuvaus>

and the second log step is never reached. I don't know how Camel and ServiceMix logging handles encodings, so this may or may not be a sign that things already went wrong when calling the service, rather than when trying to unmarshal the response.

I am using ServiceMix 4.5.2 and the bundled Camel 2.10.6.
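
To make the byte-level reasoning above a bit more concrete, here is a small standalone Java sketch (plain JDK, no Camel involved; the class name Utf8Check and its helper methods are made up for this illustration). It rests on an assumption that I have not confirmed from the route: that somewhere before the unmarshal step the body ends up as ISO-8859-1 bytes, so that "Ä" becomes the single byte 0xc4. A strict UTF-8 decoder would then take 0xc4 as the start of a two-byte sequence and reject the "<" (0x3c) that follows, which at least resembles the "Invalid UTF-8 middle byte 0x3c" message. A lenient decode of the same bytes gives the replacement character that shows up in the log.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, only for checking the byte-level theory.
class Utf8Check {

    // True if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isStrictUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Lowercase hex dump, bytes separated by single spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The end of the element text plus the start of the closing tag.
        String tail = "\u00C4<";

        byte[] utf8 = tail.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = tail.getBytes(StandardCharsets.ISO_8859_1);

        // Correct UTF-8: c3 84 3c, accepted by a strict decoder.
        System.out.println(hex(utf8) + " strict UTF-8: " + isStrictUtf8(utf8));

        // Assumed broken case: c4 3c, where 3c is not a valid continuation byte.
        System.out.println(hex(latin1) + " strict UTF-8: " + isStrictUtf8(latin1));

        // A lenient decode replaces the bad byte with U+FFFD, as seen in the log.
        String lenient = new String(latin1, StandardCharsets.UTF_8);
        System.out.println("lenient decode starts with U+FFFD: "
                + (lenient.charAt(0) == '\uFFFD'));
    }
}
```

If this matches what happens in the route, the encoding is probably lost before the unmarshal step and not inside JAXB itself, but I may well be misreading the error.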