Hi Michele
Reading a CSV with 40k lines using Camel in streaming mode takes a few
seconds. Since you limit the queue size to avoid OOM, the overall performance
depends on how fast you can empty the queue.
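A minimal sketch of what I mean, not your actual route: the endpoint names
file:inbox and seda:rows and the queue size of 1000 are assumptions made up
for illustration. The file is split line by line in streaming mode, and the
bounded SEDA queue blocks the producer when it is full instead of growing in
memory.

```xml
<!-- Hypothetical sketch: endpoint names and sizes are assumptions -->
<route id="StreamingSplit_Sketch">
  <from uri="file:inbox" />
  <!-- streaming="true" reads the file line by line instead of loading it whole -->
  <split streaming="true">
    <tokenize token="\n" />
    <!-- bounded queue: the splitter waits when 1000 rows are pending -->
    <to uri="seda:rows?size=1000&amp;blockWhenFull=true" />
  </split>
</route>

<route id="RowConsumer_Sketch">
  <from uri="seda:rows?size=1000" />
  <!-- per-row work (unmarshal, transform, send) would go here -->
  <log message="Row ${body}" />
</route>
```

With this layout the reader can never run further ahead than the queue size,
so the throughput is set entirely by the consuming route.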
How long does processing ONE message take on average? To me it looks
like approximately 0.6 secs (6*60*60/35000), i.e. about 1.6 messages per
second (35000/6/60/60). Is the process responsible for reading the queue
single-threaded?
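If the queue consumer does turn out to be single-threaded, one thing to try is
letting several consumers read the queue in parallel. A rough sketch, assuming
the consumer is itself a Camel route on activemq:queue:CBIKIT (I have not seen
that route, so the route id, the value 5 and the log step are placeholders):

```xml
<!-- Hypothetical consumer route: id, consumer count and body are assumptions -->
<route id="QueueConsumer_Sketch">
  <!-- concurrentConsumers is the JMS/ActiveMQ component option for parallel consumers -->
  <from uri="activemq:queue:CBIKIT?concurrentConsumers=5" />
  <log message="Consumed ${body}" />
</route>
```

This only helps if the per-message work can safely run in parallel and the
downstream system can keep up.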
Jens
On 15/04/16 at 14:59, Michele wrote:
Hi,
I spent a bit of time reading different topics on this issue, and I changed
my route as follows, which reduced memory usage by about 300 MB:
<route id="FileRetriever_Route">
  <from uri="{{uri.inbound}}?scheduler=quartz2&amp;scheduler.cron={{poll.consumer.scheduler}}&amp;scheduler.triggerId=FileRetriever&amp;scheduler.triggerGroup=IF_CBIKIT{{uri.inbound.options}}" />
  <setHeader headerName="ImportDateTime">
    <simple>${date:now:yyyyMMdd-HHmmss}</simple>
  </setHeader>
  <setHeader headerName="MsgCorrelationId">
    <simple>CBIKIT_INBOUND_${in.header.ImportDateTime}</simple>
  </setHeader>
  <setHeader headerName="breadcrumbId">
    <simple>Import-${in.header.CamelFileName}-${in.header.ImportDateTime}-${in.header.breadcrumbId}</simple>
  </setHeader>
  <to uri="seda:processAndStoreInQueue" />
  <log message="END - FileRetriever_Route" />
</route>
<route id="ProcessAndStoreInQueue_Route">
  <from uri="seda:processAndStoreInQueue" />
  <unmarshal>
    <bindy type="Csv"
           classType="com.fincons.ingenico.crt2.cbikit.inbound.model.RowData" />
  </unmarshal>
  <split streaming="true" executorServiceRef="myThreadPoolExecutor">
    <simple>${body}</simple>
    <choice>
      <when>
        <simple></simple>
        <setHeader headerName="CamelSplitIndex">
          <simple>${in.header.CamelSplitIndex}</simple>
        </setHeader>
        <process ref="BodyEnricherProcessor" />
        <to uri="dozer:transform?mappingFile=file:{{crt2.apps.home}}{{dozer.mapping.path}}&amp;targetModel=com.fincons.ingenico.crt2.cbikit.inbound.model.SerialNumber" />
        <marshal ref="Gson" />
        <to uri="activemq:queue:CBIKIT" />
      </when>
      <otherwise>
        <log message="Message discarded ${in.header.CamelSplitIndex} - ${body}" />
      </otherwise>
    </choice>
  </split>
</route>
The last test successfully processed a 35000-line CSV file in about 6 h, with
an average memory usage of 1400 MB. Can I improve the processing performance
further?
In addition, I noticed that the queue size of the queue is low. Why? (Is the
producer slower than the consumer?)
Thanks in advance.
Best Regards
Michele