What happens is that your default charset is win-1251 while the file is UTF-8.
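To illustrate the mismatch, here is a minimal sketch in plain Java. It is not Camel code: the class and method names are made up for illustration, and it only assumes that the text passes through the platform default charset at some point, as described in this thread. It shows that Cyrillic text survives a trip through windows-1251 while a German umlaut does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Encode text with the given charset and decode it again, which is what
    // effectively happens when text is flattened to a single-byte charset.
    static String roundTrip(String text, Charset charset) {
        // getBytes(Charset) substitutes the charset's replacement bytes
        // (a '?' for windows-1251) for characters it cannot represent.
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        Charset win1251 = Charset.forName("windows-1251");

        // Unicode escapes keep this source file independent of its own encoding.
        String russian = "\u041F\u0440\u0438\u0432\u0435\u0442"; // Russian for "hello"
        String german = "B\u00E4r";                              // German word with a-umlaut

        // Cyrillic letters exist in windows-1251, so they survive.
        System.out.println(russian.equals(roundTrip(russian, win1251))); // true

        // The a-umlaut does not exist in windows-1251 and becomes '?'.
        System.out.println(roundTrip(german, win1251)); // B?r

        // UTF-8 can represent both, so nothing is lost.
        System.out.println(german.equals(roundTrip(german, StandardCharsets.UTF_8))); // true

        // The charset the JVM falls back to when none is given explicitly.
        System.out.println(Charset.defaultCharset());
    }
}
```

The last line prints whatever the JVM picked up as its default, which is the value to check on the affected Windows machine.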
The file is read correctly according to the charset argument passed to the toInputStream method; however, the charset then used to parse and send the stream is the platform default charset, not the one you specified.

The immediate workaround for you is to add an explicit charset when launching the JVM:

-Dfile.encoding=UTF-8

I would recommend you go ahead, file a bug and add a simple test case in IOConverterTest around line 83.

> On Mar 5, 2016, at 11:05 PM, fedd <feddkr...@hotmail.com> wrote:
>
> I made an experiment and saw that the situation is much worse than just
> losing one frequent Russian letter.
>
> I made a UTF-8 file with both Russian text and one German A Umlaut letter,
> and Camel was unable to read a German letter replacing it with a question
> mark, just because my windows dev machine native charset happened to be
> win-1251.
>
> I don't really think it's okay
>
> 1) to ever flatten Unicode strings to a single byte character set;
>
> 2) when the behaviour of the server side code depends on the host operating
> system settings (becomes not portable)
>
> May I file a Jira bug report?
>
> Here's my route:
>
> <dataFormats>
>     <json id="jack" library="Jackson" prettyPrint="true"/>
> </dataFormats>
>
> <route>
>
>     <from
> uri="file:///C:/tries/collApp/exchange/in?fileName=registerSampleUtf.csv&charset=UTF-8"/>
>     <log message="file: ${body.class.name} ${body}"
> loggingLevel="WARN"/>
>     <unmarshal>
>         <csv delimiter=";" useMaps="true" />
>     </unmarshal>
>     <log message="unmarshalled: ${body.class.name} ${body}"
> loggingLevel="WARN"/>
>     <marshal ref="jack"/>
>     <log message="marshalled: ${body}" loggingLevel="WARN"/>
>     <to
> uri="file:///C:/tries/collApp/exchange/out?fileName=out.json"/>
> </route>
>
> At the first "log" only a German letter is replaced with the question mark.
>
> At the second, all Russian letters are replaced with the question marks.
>
> The resulting JSON can't even display the question marks when read in any of
> the world's encodings.
>
> Shall I provide a test CSV file here? (warning: it contains Russian letters)
>
>
>
> --
> View this message in context:
> http://camel.465427.n5.nabble.com/A-possible-bug-in-IOConverter-with-Win-1251-charset-tp5778665p5778666.html
> Sent from the Camel Development mailing list archive at Nabble.com.