Hi

Yeah, it would be good if you can try the suggestions from Antoine, and if you can reproduce the problem in a unit test and possibly provide a fix as a PR / patch.

We love contributions
http://camel.apache.org/contributing

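As a starting point for such a test, here is a minimal sketch of the effect in plain Java. It does not touch Camel at all: the class and method names are made up for illustration, and it only assumes that pushing the text through a win-1251 encode/decode step is where the characters get lost, as Antoine suggests.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Camel.
final class CharsetRoundTrip {

    // Encode the text to bytes and decode it again with the same charset.
    // String.getBytes(Charset) substitutes the charset's replacement bytes
    // ('?' for windows-1251) for every character it cannot map.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Russian "Privet" followed by a German A umlaut, written with
        // escapes so the encoding of this source file does not matter.
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442 \u00c4";
        Charset win1251 = Charset.forName("windows-1251");

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8 round trip intact:    "
                + text.equals(roundTrip(text, StandardCharsets.UTF_8)));
        System.out.println("win-1251 round trip intact: "
                + text.equals(roundTrip(text, win1251)));
        System.out.println("win-1251 replaced A umlaut: "
                + roundTrip(text, win1251).endsWith("?"));
    }
}
```

The first output line shows which default the JVM picked up; that is the value -Dfile.encoding=UTF-8 overrides. A real test in IOConverterTest would have to go through the converter itself rather than this helper.
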
On Tue, Mar 8, 2016 at 12:53 AM, Antoine Toulme <[email protected]> wrote:
> What happens is that your default charset is win-1251 while the file is UTF-8.
>
> The file is read correctly according to the charset argument passed to the
> toInputStream method; however, the charset used to parse and send
> the stream is the JVM default charset.
>
> The immediate workaround for you is to add an explicit charset when launching
> the JVM: -Dfile.encoding=UTF-8
>
> I would recommend you go ahead, file a bug and add a simple test case in
> IOConverterTest around line 83.
>
>> On Mar 5, 2016, at 11:05 PM, fedd <[email protected]> wrote:
>>
>> I made an experiment and saw that the situation is much worse than just
>> losing one frequent Russian letter.
>>
>> I made a UTF-8 file with both Russian text and one German A Umlaut letter,
>> and Camel was unable to read the German letter, replacing it with a question
>> mark, just because my Windows dev machine's native charset happened to be
>> win-1251.
>>
>> I don't really think it's okay
>>
>> 1) to ever flatten Unicode strings to a single byte character set;
>>
>> 2) when the behaviour of the server side code depends on the host operating
>> system settings (becomes not portable)
>>
>> May I file a Jira bug report?
>>
>> Here's my route:
>>
>> <dataFormats>
>>   <json id="jack" library="Jackson" prettyPrint="true"/>
>> </dataFormats>
>>
>> <route>
>>   <from uri="file:///C:/tries/collApp/exchange/in?fileName=registerSampleUtf.csv&amp;charset=UTF-8"/>
>>   <log message="file: ${body.class.name} ${body}" loggingLevel="WARN"/>
>>   <unmarshal>
>>     <csv delimiter=";" useMaps="true" />
>>   </unmarshal>
>>   <log message="unmarshalled: ${body.class.name} ${body}" loggingLevel="WARN"/>
>>   <marshal ref="jack"/>
>>   <log message="marshalled: ${body}" loggingLevel="WARN"/>
>>   <to uri="file:///C:/tries/collApp/exchange/out?fileName=out.json"/>
>> </route>
>>
>> At the first "log" only the German letter is replaced with a question mark.
>>
>> At the second, all Russian letters are replaced with question marks.
>>
>> The resulting JSON can't even display the question marks when read in any of
>> the world's encodings.
>>
>> Shall I provide a test CSV file here? (warning: it contains Russian letters)
>>
>>
>>
>> --
>> View this message in context:
>> http://camel.465427.n5.nabble.com/A-possible-bug-in-IOConverter-with-Win-1251-charset-tp5778665p5778666.html
>> Sent from the Camel Development mailing list archive at Nabble.com.
>

--
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2
