Hi

You can't really compare the two approaches. In the pure Java code you just have a for loop in a single method, which is about as fast as you can go. When you use Camel routes and the EIPs, a lot more goes on under the hood.
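
For reference, a minimal sketch of what such a single-method baseline might look like. The Java code from the earlier message is not reproduced in this mail, so the class name `LineLoopBaseline`, the method `countLines`, and the `logEvery` parameter are all hypothetical, and the in-memory sample input only stands in for the real test file.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical baseline: one loop, one method, no routing or exchange objects.
class LineLoopBaseline {

    // Reads the source line by line and returns the number of lines seen.
    // Prints a progress message every logEvery-th line (assumed parameter).
    static long countLines(Reader source, long logEvery) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            while (reader.readLine() != null) {
                count++;
                if (count % logEvery == 0) {
                    System.out.println("Processed " + count + " lines");
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        long lines;
        if (args.length > 0) {
            // First argument is taken as the path of the file to read.
            lines = countLines(new FileReader(args[0]), 50_000);
        } else {
            // Tiny in-memory sample so the sketch runs without a test file.
            lines = countLines(new StringReader("Test Line\nTest Line\nTest Line\n"), 50_000);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(lines + " lines in " + elapsedMs + " ms");
    }
}
```

A loop like this does nothing per line beyond the read and a counter increment, whereas a route does additional work for every message it creates and passes through each processor, so the two numbers are not measuring the same thing.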

On Sun, Jul 21, 2013 at 10:12 PM, Viktor Kubinec <vikto...@seznam.cz> wrote:
> Thanks Willem.
>
> I've created a custom splitter, which uses BufferedReader, just as you
> suggested. It has improved the performance a bit, but it is still much slower
> (10-20x) than the Java code that I posted. I have spent a bit more time
> investigating this problem. The code (with documentation on how to use it)
> is available here:
>
> https://github.com/Viktor-Kubinec/Camel
>
> I am using Apache Camel version 2.11.1. My test file is ~1 GB and contains
> 100M rows with the message "Test Line" in each row. I have two testing
> scenarios:
>
> 1. SplitTest - I just read a big file containing short lines, split it line by
> line with my custom splitter, and log progress (every 50k-th message).
>
> 2. Same as 1., but I've added 10 processors that do nothing (because I
> observed that adding processors reduces performance).
>
> Results:
>
> Java code: 1.5-2M lines per second - no profiling done
>
> 1. 150k lines per second, and here are some hot spots (from VisualVM):
>
> java.lang.Class.getSimpleName() (57.3%)
> org.apache.camel.processor.MulticastProcessor.doAggregate() (39.2%)
> java.io.BufferedReader.readLine() (3.1%)
> ...
>
> 2. 72k lines per second:
>
> org.apache.camel.processor.MulticastProcessor.doAggregate() (70.5%)
> java.lang.Class.getSimpleName() (28.8%)
> java.io.BufferedReader.readLine() (0.3%)
> ...
>
> I have saved snapshots from VisualVM - they are available in the GitHub
> repository (link above): SplitTest.nps, SplitTestWithProcessors.nps. More
> details are available there.
>
> This was done on my laptop. I've read that VisualVM has some effect on
> performance (~5%). java.lang.Class.getSimpleName() is called in the
> constructor of org.apache.camel.impl.DefaultUnitOfWork. In the first scenario
> only 3.1% of thread time was spent in the readLine() method. In the second
> scenario it was only 0.3%. That is very low.
>
> Is there a way I can optimize this?
>
> Thanks,
>
> Viktor
>
> --
> View this message in context:
> http://camel.465427.n5.nabble.com/Message-Processing-Performance-while-splitting-tp5735824p5735979.html
> Sent from the Camel - Users mailing list archive at Nabble.com.

--
Claus Ibsen
-----------------
Red Hat, Inc.
Email: cib...@redhat.com
Twitter: davsclaus
Blog: http://davsclaus.com
Author of Camel in Action: http://www.manning.com/ibsen