[ https://issues.apache.org/jira/browse/CAMEL-11846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245564#comment-16245564 ]
Robert Half edited comment on CAMEL-11846 at 11/9/17 12:36 PM:
---------------------------------------------------------------

!UTF-16BE (with BOM).png!

Hmm, I am not able to attach an XML file here...
I think you can easily create a test file by converting any UTF-8 XML file to UTF-16BE, for example with Notepad++.


was (Author: antidote2):
!UTF-16BE (with BOM).png!

Hmmm I am not able to attach xml file here...

> xtokenize and apply xslt to a string does not work with UTF-16BE
> -----------------------------------------------------------------
>
>                 Key: CAMEL-11846
>                 URL: https://issues.apache.org/jira/browse/CAMEL-11846
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.17.5
>            Reporter: Robert Half
>         Attachments: UTF-16BE (with BOM).png
>
>
> In XML, the encoding is often declared inside the <?xml ..?> declaration. In
> general, you cannot read the declaration if you don't know the encoding, but
> XML parsers can detect several encodings, which allows them to read it. With
> that information they can read the whole file without knowing the charset in
> the first place.
> xtokenize and xslt use XMLInputFactory#createXMLStreamReader(Reader). But by
> providing a Reader, Camel tells the parser that it already knows the
> encoding, so the XML parser does not detect it.
> Also, Camel sets the charset to UTF-8 if it is not provided in a header.
> This makes the underlying reader fail when reading UTF-16.
> Using XMLInputFactory#createXMLStreamReader(InputStream) inside
> XMLTokenExpressionIterator works (tried in a patch). But the next xslt step
> fails again because it also uses a Reader.
> See the Stack Overflow question for reference:
> [https://stackoverflow.com/questions/46322376/apache-camel-to-handle-encoding-declared-in-xml-file]
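
For anyone who wants to see the Reader-versus-InputStream difference outside of Camel, here is a minimal sketch that uses only the JDK's StAX classes. It is not Camel code and not the patch mentioned in the issue. The class and method names (EncodingDetectionSketch, utf16beWithBom, rootNameFromStream, rootNameFromReader) are made up for illustration, and the test document is generated in memory as a stand-in for a file converted to UTF-16BE with a BOM.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only (not part of Camel).
class EncodingDetectionSketch {

    /** Encodes the XML text as UTF-16BE and prepends the BOM bytes 0xFE 0xFF. */
    static byte[] utf16beWithBom(String xml) {
        byte[] body = xml.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    /** Advances to the first start element and returns its local name. */
    private static String rootName(XMLStreamReader reader) throws XMLStreamException {
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    /** Hands the raw bytes to the parser, so the parser detects the encoding itself. */
    static String rootNameFromStream(byte[] bytes) {
        try {
            return rootName(XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(bytes)));
        } catch (XMLStreamException e) {
            return null; // parser rejected the input
        }
    }

    /** Decodes with a fixed charset first, so the parser never sees the raw bytes. */
    static String rootNameFromReader(byte[] bytes, Charset charset) {
        Reader decoded = new InputStreamReader(new ByteArrayInputStream(bytes), charset);
        try {
            return rootName(XMLInputFactory.newInstance().createXMLStreamReader(decoded));
        } catch (XMLStreamException e) {
            return null; // parser rejected the wrongly decoded characters
        }
    }

    public static void main(String[] args) {
        byte[] bytes = utf16beWithBom("<?xml version=\"1.0\" encoding=\"UTF-16\"?><root/>");
        System.out.println("InputStream:     " + rootNameFromStream(bytes));
        System.out.println("Reader as UTF-8: " + rootNameFromReader(bytes, StandardCharsets.UTF_8));
    }
}
```

With the JDK's default StAX implementation, the InputStream variant should find the root element, while the Reader forced to UTF-8 should fail to parse and return null. That matches the behaviour described in the issue, where UTF-8 is assumed when no charset header is set.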