Thanks, Willem.

I've created a custom splitter that uses a BufferedReader, just as you
suggested. It has improved performance a bit, but it is still much slower
(10-20x) than the Java code that I posted. I have spent a bit more time
investigating this problem. The code (with documentation on how to use it)
is available here:

https://github.com/Viktor-Kubinec/Camel
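
For context, the core idea of the custom splitter is roughly the following
(a simplified sketch only - class and method names here are illustrative,
the real code and documentation are in the repository above):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Streams the body line by line through a BufferedReader and hands the lines
// to Camel's splitter as an Iterator, so the whole file is never in memory.
public class LineSplitterBean {

    public Iterator<String> split(InputStream body) {
        final BufferedReader reader = new BufferedReader(new InputStreamReader(body));
        return new Iterator<String>() {
            private String next = readLine();

            private String readLine() {
                try {
                    String line = reader.readLine();
                    if (line == null) {
                        reader.close();
                    }
                    return line;
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }

            public boolean hasNext() {
                return next != null;
            }

            public String next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String current = next;
                next = readLine();
                return current;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}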

I am using Apache Camel version 2.11.1. My test file is ~1 GB and contains
100M rows, each containing the message "Test Line". I have two testing
scenarios:

1. SplitTest - I read the big file of short lines, split it line by line
with my custom splitter, and log progress (every 50,000th message).

2. Same as 1., but with 10 added processors that do nothing (because I
observed that adding processors reduces performance). A route sketch for
both scenarios follows below.
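
The routes look roughly like this (a sketch that assumes the splitter bean
above is registered as "lineSplitterBean"; the endpoint URI and the progress
logging are simplified):

import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.camel.builder.RouteBuilder;

public class SplitTestRoute extends RouteBuilder {

    // Processor that does nothing; scenario 2 adds ten of these to the route.
    private static final Processor NOOP = new Processor() {
        public void process(Exchange exchange) throws Exception {
            // intentionally empty
        }
    };

    @Override
    public void configure() throws Exception {
        from("file:target/input?noop=true")
            .split().method("lineSplitterBean", "split").streaming()
                // scenario 2: repeat .process(NOOP) ten times here
                .process(NOOP)
                // the real test only logs every 50,000th line
                .log("split line ${property.CamelSplitIndex}")
            .end();
    }
}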

Results:

Java code: 1.5-2M lines per second (no profiling done)
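
(That plain Java comparison is essentially just a read loop over the same
file - roughly the kind of loop I posted earlier, simplified:)

import java.io.BufferedReader;
import java.io.FileReader;

// Baseline: read the same file line by line outside Camel and report throughput.
public class PlainReadBaseline {
    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new FileReader(args[0]));
        long start = System.currentTimeMillis();
        long count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        reader.close();
        long elapsedMs = System.currentTimeMillis() - start;
        System.out.println(count + " lines in " + elapsedMs + " ms");
    }
}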

1. 150k lines per second; here are some hot spots (from VisualVM):

java.lang.Class.getSimpleName()                              (57.3%)
org.apache.camel.processor.MulticastProcessor.doAggregate()  (39.2%)
java.io.BufferedReader.readLine()                             (3.1%)
...

2. 72k lines per second:

org.apache.camel.processor.MulticastProcessor.doAggregate()  (70.5%)
java.lang.Class.getSimpleName()                              (28.8%)
java.io.BufferedReader.readLine()                             (0.3%)
...

I have saved the VisualVM snapshots - they are available in the GitHub
repository (link above) as SplitTest.nps and SplitTestWithProcessors.nps.
More details are available there.

This was done on my laptop. I've read that VisualVM itself has some effect
on performance (~5%). java.lang.Class.getSimpleName() is being called in the
constructor of org.apache.camel.impl.DefaultUnitOfWork. In the first
scenario only 3.1% of thread time was spent in the readLine() method; in the
second scenario it was only 0.3%. That is very low - almost all of the time
goes into Camel overhead rather than into reading the file.

Is there a way I can optimize this?

Thanks,

Viktor