Hi

You can't really compare the two approaches.
In the pure Java code you just have a for loop in a single method,
which is about as fast as you can go.
When you use Camel routes and the EIPs, a lot more goes on under the hood.
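
To make the difference concrete, here is a rough sketch in plain Java. It does not use Camel at all. The Envelope and Step types are hypothetical stand-ins I made up for illustration (think of a per-message wrapper and an empty processor), and the in-memory input stands in for the big test file. It only shows the general shape of the extra per-line work, not what Camel actually does internally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class SplitOverheadSketch {

    /** Hypothetical per-message wrapper; NOT a Camel class. */
    static final class Envelope {
        final String body;
        final Map<String, Object> headers = new HashMap<>();

        Envelope(String body, long index) {
            this.body = body;
            headers.put("index", index);
        }
    }

    /** Hypothetical no-op step, loosely analogous to an empty processor. */
    interface Step {
        void process(Envelope envelope);
    }

    /** Plain loop: read each line and count it, nothing else. */
    static long plainLoop(BufferedReader reader) {
        try {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Same loop, but each line is wrapped and passed through every step. */
    static long wrappedLoop(BufferedReader reader, Step[] steps) {
        try {
            long count = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                Envelope envelope = new Envelope(line, count);
                for (Step step : steps) {
                    step.process(envelope);
                }
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds an in-memory input of identical short lines. */
    static BufferedReader sampleInput(int lines) {
        StringBuilder sb = new StringBuilder(lines * 10);
        for (int i = 0; i < lines; i++) {
            sb.append("Test Line\n");
        }
        return new BufferedReader(new StringReader(sb.toString()));
    }

    public static void main(String[] args) {
        int lines = 1_000_000;
        Step[] steps = new Step[10];
        for (int i = 0; i < steps.length; i++) {
            steps[i] = envelope -> { };
        }
        // Build both inputs first so the timings cover only the loops.
        BufferedReader first = sampleInput(lines);
        BufferedReader second = sampleInput(lines);

        long t0 = System.nanoTime();
        long plain = plainLoop(first);
        long t1 = System.nanoTime();
        long wrapped = wrappedLoop(second, steps);
        long t2 = System.nanoTime();

        // No JIT warm-up here, so treat the numbers as a rough indication only.
        System.out.println("plain:   " + plain + " lines, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("wrapped: " + wrapped + " lines, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

A real route does considerably more per message than this sketch, so the numbers it prints say nothing about Camel itself. The point is only that any work done once per line adds up quickly over 100M lines.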


On Sun, Jul 21, 2013 at 10:12 PM, Viktor Kubinec <vikto...@seznam.cz> wrote:
> Thanks Willem.
>
> I've created a custom splitter that uses BufferedReader, just as you
> suggested. It has improved the performance a bit, but it is still much slower
> (10-20x) than the Java code that I posted. I have spent a bit more time
> investigating this problem. The code (with documentation on how to use it)
> is available here:
>
> https://github.com/Viktor-Kubinec/Camel
>
> I am using Apache Camel version 2.11.1. My test file is ~1 GB and contains
> 100M rows with the message "Test Line" in each row. I have two testing
> scenarios:
>
> 1. SplitTest - I just read a big file containing short lines, split it line by
> line with my custom splitter, and log the progress (every 50k-th message)
>
> 2. Same as 1, but I've added 10 processors that do nothing (because I
> observed that adding processors reduces performance)
>
> Results:
>
> Java code: 1.5-2M lines per second - no profiling done
>
> 1. 150k lines per second, and here are some hot spots (from VisualVM):
>
> java.lang.Class.getSimpleName()  (57.3%)
> org.apache.camel.processor.MulticastProcessor.doAggregate()  (39.2%)
> java.io.BufferedReader.readLine()        (3.1%)
> ...
>
> 2. 72k lines per second:
>
> org.apache.camel.processor.MulticastProcessor.doAggregate()   (70.5%)
> java.lang.Class.getSimpleName()  (28.8%)
> java.io.BufferedReader.readLine()  (0.3%)
> ...
>
> I have saved snapshots from VisualVM. They are available in the GitHub
> repository (link above): SplitTest.nps, SplitTestWithProcessors.nps. More
> details are available there.
>
> This was done on my laptop. I've read that VisualVM has some effect on
> performance (~5%). java.lang.Class.getSimpleName() is being called in
> the constructor of org.apache.camel.impl.DefaultUnitOfWork. In the first
> scenario only 3.1% of thread time was spent in the readLine() method. In the
> second scenario it was only 0.3%. That is very low.
>
> Is there a way I can optimize this?
>
> Thanks,
>
> Viktor
>
>
>
>
>
> --
> View this message in context: 
> http://camel.465427.n5.nabble.com/Message-Processing-Performance-while-splitting-tp5735824p5735979.html
> Sent from the Camel - Users mailing list archive at Nabble.com.



-- 
Claus Ibsen
-----------------
Red Hat, Inc.
Email: cib...@redhat.com
Twitter: davsclaus
Blog: http://davsclaus.com
Author of Camel in Action: http://www.manning.com/ibsen
