Thanks Willem. I've created a custom splitter that uses BufferedReader, just as you suggested. It has improved the performance a bit, but it is still much slower (10-20x) than the Java code that I posted. I have spent a bit more time investigating this problem. The code (with documentation on how to use it) is available here:
https://github.com/Viktor-Kubinec/Camel

I am using Apache Camel version 2.11.1. My test file is ~1 GB and contains 100M rows, each with the message "Test Line".

I have two test scenarios:

1. SplitTest - I read a big file containing short lines, split it line by line with my custom splitter, and log progress (every 50k-th message).
2. Same as 1, but with 10 added processors that do nothing (because I observed that adding processors reduces performance).

Results:

Java code: 1.5-2M lines per second (no profiling done)

Scenario 1: 150k lines per second. Hot spots (from VisualVM):
    java.lang.Class.getSimpleName() (57.3%)
    org.apache.camel.processor.MulticastProcessor.doAggregate() (39.2%)
    java.io.BufferedReader.readLine() (3.1%)
    ...

Scenario 2: 72k lines per second. Hot spots:
    org.apache.camel.processor.MulticastProcessor.doAggregate() (70.5%)
    java.lang.Class.getSimpleName() (28.8%)
    java.io.BufferedReader.readLine() (0.3%)
    ...

I have saved the VisualVM snapshots in the GitHub repository (link above): SplitTest.nps and SplitTestWithProcessors.nps. More details are available there. This was done on my laptop. I've read that VisualVM itself has some effect on performance (~5%).

java.lang.Class.getSimpleName() is called in the constructor of org.apache.camel.impl.DefaultUnitOfWork.

In the first scenario only 3.1% of thread time was spent in the readLine() method. In the second scenario it was only 0.3%. That is very low, which means almost all of the time goes to per-message overhead rather than to reading the file.

Is there a way I can optimize this?

Thanks,
Viktor
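
P.S. For anyone who wants a Camel-free reference point for the "java code" figure, a baseline of roughly the following shape should do. This is only a minimal sketch, not the exact code from the repository: the class name LineReadBaseline and the default file name test.txt are placeholders I made up for illustration. It reads the file with a BufferedReader and logs the throughput every 50k lines, like the test scenarios do.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical baseline class; the name is an assumption, not taken from the repository.
public class LineReadBaseline {

    // Log progress every 50k lines, mirroring the logging interval of the Camel scenarios.
    static final long LOG_INTERVAL = 50_000;

    /** Reads the file line by line and returns the number of lines read. */
    public static long countLines(Path file) throws IOException {
        long count = 0;
        long start = System.nanoTime();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                count++;
                if (count % LOG_INTERVAL == 0) {
                    double seconds = (System.nanoTime() - start) / 1e9;
                    System.out.printf("%d lines, %.0f lines/s%n", count, count / seconds);
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "test.txt" is a placeholder; pass the real path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "test.txt");
        long start = System.nanoTime();
        long lines = countLines(file);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("Total: %d lines in %.2f s%n", lines, seconds);
    }
}
```

Running it against the same test file gives a number to compare with the Camel routes. The gap between the two is then per-message overhead, since the file reading itself is the same BufferedReader.readLine() call in both cases.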