The following should work for you:

    from("file://src/test/data?fileName=crm.sample.csv&noop=true&charset=ISO-8859-1")
        .split(body().tokenize("\n"))

It's important to use the charset "ISO-8859-1". I will open a JIRA to improve the current behavior in Camel. If you use the wrong charset (e.g. "UTF-8" in this case), the Scanner stops reading and stores the exception, which we can access by calling "scanner.ioException()". We have to check this and possibly rethrow the exception.

The following test shows this behavior.

Succeeds:

    @Test
    public void scannerTest() throws Exception {
        Scanner scanner = new Scanner(new File("src/test/data/crm.sample.csv"), "ISO-8859-1");
        scanner.useDelimiter("\n");

        int counter = 0;
        while (scanner.hasNext()) {
            scanner.next();
            ++counter;
        }

        assertEquals(289, counter);
    }

Fails:

    @Test
    public void scannerTest() throws Exception {
        Scanner scanner = new Scanner(new File("src/test/data/crm.sample.csv"), "UTF-8");
        scanner.useDelimiter("\n");

        int counter = 0;
        // the Scanner stops silently once decoding fails, so fewer tokens are counted
        while (scanner.hasNext()) {
            scanner.next();
            ++counter;
        }

        assertEquals(289, counter);
    }

Hope this helps.

Best,
Christian

On Wed, Oct 31, 2012 at 11:04 PM, Denis S <dsoukhoros...@yahoo.com> wrote:

> crm.sample.csv
> <http://camel.465427.n5.nabble.com/file/n5721918/crm.sample.csv>
>
> this is a small portion of the file. see around 260..280 lines. I'm not sure
> if the file will help you reproduce my issue: now it is in win format.
>
> Thanks, Denis.
>
> --
> View this message in context:
> http://camel.465427.n5.nabble.com/Trouble-with-split-tokenize-on-linux-tp5721677p5721918.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
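
P.S. In case it is useful, here is a minimal sketch of the "check scanner.ioException() and rethrow" idea mentioned above, using only java.util.Scanner. This is not Camel's code: the class name ScannerCharsetCheck, the helper countLines and the tiny temp file are made up for illustration, and the temp file only stands in for crm.sample.csv by containing one ISO-8859-1 byte (0xE9) that is not valid UTF-8.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Scanner;

final class ScannerCharsetCheck {

    // Hypothetical helper: counts "\n"-delimited tokens and rethrows the
    // IOException that the Scanner would otherwise keep to itself.
    static int countLines(File file, String charsetName) throws IOException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            scanner.useDelimiter("\n");

            int counter = 0;
            while (scanner.hasNext()) {
                scanner.next();
                ++counter;
            }

            // hasNext() simply returns false after a decoding error;
            // the cause is only visible through ioException().
            IOException stored = scanner.ioException();
            if (stored != null) {
                throw stored;
            }
            return counter;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample data: three lines, the second one contains 0xE9.
        File file = File.createTempFile("scanner-check", ".csv");
        file.deleteOnExit();
        Files.write(file.toPath(), "a\nb\u00e9c\nd\n".getBytes(StandardCharsets.ISO_8859_1));

        System.out.println("ISO-8859-1 tokens: " + countLines(file, "ISO-8859-1"));

        try {
            countLines(file, "UTF-8");
        } catch (IOException e) {
            System.out.println("UTF-8 decoding failed: " + e);
        }
    }
}
```

Note that the sketch deliberately uses the Scanner(File, String) constructor, the same one as in the tests above. With the right charset the helper just returns the count; with the wrong one the caller gets an exception instead of a silently truncated result.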