Hi Claus, I'm afraid it would be difficult to produce a code snippet that is relevant to the problem without disclosing sensitive code.
I have, however, done some further analysis. First, I wasn't exactly correct when I said that using parallel processing makes no difference. When I run the version without concurrency (1 Camel thread + 0 pool threads) it takes 38s. When I run with a thread pool of 1 (1 Camel thread + 1 pool thread) it takes 26s. Adding more threads to the pool doesn't improve performance, and eventually makes it worse beyond about 8 threads (probably due to the overhead of thread context switching). This is using a split size of 1000.

So to me it seems that most of the work is actually done splitting the file, and therefore there is little to be gained by adding more threads to process the lines.

I have also made an interesting run with a split size of 1m (file size ~4m) and a pool of 4 threads. The thread activity looks like this:
<http://camel.465427.n5.nabble.com/file/n5737423/CamelParallelProcessing3.jpg>
We can see each thread of the pool processing 1m lines, but they don't seem to interleave very well.

I have also separately tested the piece of code that processes the lines of the file, and it scales well up to 4 threads (about 2.5x speed-up).

All of that is a bit confusing, but it seems that splitting the file is the most time-consuming task. Is there a way in Camel to leverage concurrency to split the file?

PS: btw, I'm running my tests on a 16-core/32-thread machine (2x Intel Xeon E5-2650).
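Since the real code can't be shared, here is a minimal plain-Java sketch of the general pattern described above: one thread reads the input and groups lines into chunks, and a fixed thread pool processes the chunks. This is not Camel's splitter and not the actual application code; the class name `ChunkedPipeline`, the `run` method, and the `processChunk` placeholder (which only sums line lengths) are hypothetical names made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedPipeline {

    // Hypothetical stand-in for the real per-line work: it just sums line lengths.
    static long processChunk(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            total += line.length();
        }
        return total;
    }

    // Reads lines on the calling thread, groups them into chunks of chunkSize,
    // and hands each chunk to a fixed pool of poolSize worker threads.
    static long run(BufferedReader reader, int chunkSize, int poolSize)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<Future<Long>> results = new ArrayList<>();
        try {
            List<String> chunk = new ArrayList<>(chunkSize);
            String line;
            // Serial part: a single thread does all the reading and splitting.
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    final List<String> batch = chunk;
                    Callable<Long> task = () -> processChunk(batch);
                    results.add(pool.submit(task));
                    chunk = new ArrayList<>(chunkSize);
                }
            }
            // Submit the last, possibly smaller, chunk.
            if (!chunk.isEmpty()) {
                final List<String> batch = chunk;
                Callable<Long> task = () -> processChunk(batch);
                results.add(pool.submit(task));
            }
            // Wait for all chunks and combine their results.
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // 2500 lines of 4 characters each, split into chunks of 1000 lines.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2500; i++) {
            sb.append("line").append('\n');
        }
        BufferedReader reader = new BufferedReader(new StringReader(sb.toString()));
        System.out.println(run(reader, 1000, 4)); // prints 10000
    }
}
```

In a layout like this the read-and-split loop stays on one thread no matter how large the pool is, so once that loop dominates the run time, extra worker threads mostly sit idle. Whether Camel's splitter behaves the same way internally is an assumption here, not something this sketch demonstrates.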