Exactly like the example shown in Chapter 10 of Camel in Action, I need to process a large CSV file (~4 million lines).
Without parallel processing it takes about 30 seconds to process the file. I had high hopes when I discovered the parallel processing feature in Camel. However, it does not improve the processing time at all (sometimes it is even worse). I'd like to know whether I'm doing something stupid in Camel or whether the problem is in my code.

Here is my route:

    ExecutorService threadPool = Executors.newFixedThreadPool(10);
    String token = "\r\n";
    int splitSize = 1000;

    from("file:myBigFile.csv")
        .routeId("route4")
        .split().tokenize(token, splitSize).streaming().executorService(threadPool)
            // myProcessor is a custom processor that creates an object from a CSV line and processes it accordingly
            .process(myProcessor)
            // on the last line
            .filter().header("CamelSplitComplete")
                // start the next route to process the next file
                .process(new RouteStarterProcessor(context, "route5"))
            .end();

I'm processing several files in sequential order. When I know I'm processing the last line, I use a custom processor (RouteStarterProcessor) to start the route that processes the next file.

When I profile my application, I can see the 10 threads of the pool, but they are doing very little work (running for 5% to 10% of the 30 seconds of processing). The Camel thread for this route, however, is running for 100% of the total processing time. According to the profiler, a lot of time is spent in the org.apache.camel.processor.MulticastProcessor$AggregateOnTheFlyTask.aggregateOnTheFly() method.

The code behind MyProcessor should be able to process the data concurrently in a fairly efficient way (it mainly adds objects to Maps and collections, and uses concurrent collections rather than synchronised locks).

It could be an issue in my code, but I'd like to know if someone can spot a mistake in my Camel route. Am I doing something that implicitly creates a thread barrier and prevents the tasks from being executed concurrently?
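To make the intended behaviour concrete, here is a minimal plain-Java sketch (no Camel involved) of the pattern I expect the route to follow: one thread groups lines into batches, and a fixed pool processes the batches concurrently. Everything in it is hypothetical and only for illustration: the class name BatchSplitSketch, the processBatch method standing in for myProcessor, and the "count lines per first column" work are assumptions, not my real code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchSplitSketch {

    // Hypothetical stand-in for myProcessor: counts lines per first CSV column.
    public static void processBatch(List<String> batch, Map<String, Integer> counts) {
        for (String line : batch) {
            String key = line.split(",", 2)[0];
            counts.merge(key, 1, Integer::sum); // atomic on ConcurrentHashMap
        }
    }

    // The calling thread only batches lines; the pool threads do the processing.
    public static Map<String, Integer> run(Iterable<String> lines, int batchSize, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        List<Future<?>> pending = new ArrayList<>();
        try {
            List<String> batch = new ArrayList<>(batchSize);
            for (String line : lines) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    final List<String> work = batch;
                    pending.add(pool.submit(() -> processBatch(work, counts)));
                    batch = new ArrayList<>(batchSize);
                }
            }
            if (!batch.isEmpty()) { // submit the final, possibly smaller, batch
                final List<String> work = batch;
                pending.add(pool.submit(() -> processBatch(work, counts)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any worker exception
            }
        } finally {
            pool.shutdown();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            lines.add((i % 3) + ",value" + i);
        }
        System.out.println(run(lines, 1000, 10));
    }
}
```

In this sketch the batching thread does almost nothing per line, so the pool threads should be the busy ones. That is the opposite of what the profiler shows for my route, where the route's own thread is saturated and the pool is mostly idle.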
Thanks