I have a large set of CSV files (each containing millions of records), so I use seda to take advantage of multi-threading. I split each file into chunks of 50000 lines, process a chunk, and get a List of entity objects, which I then split again and persist to the DB using JPA. Initially I was getting an Out of Heap Memory exception, but after moving to a machine with a higher configuration the heap issue was solved.
But right now the issue is that duplicate records are being inserted into the DB: if there are 1000000 records in the CSV, around 2000000 records end up in the DB. There is no primary key for the records in the CSV files, so I have used Hibernate to generate one. Below is my code (camel-context.xml):

<camelContext xmlns="http://camel.apache.org/schema/spring">

    <route>
        <from uri="file:C:\Users\PPP\Desktop\input?noop=true" />
        <to uri="seda:StageIt" />
    </route>

    <route>
        <from uri="seda:StageIt?concurrentConsumers=1" />
        <split streaming="true">
            <tokenize token="\n" group="50000" />
            <to uri="seda:WriteToFile" />
        </split>
    </route>

    <route>
        <from uri="seda:WriteToFile?concurrentConsumers=8" />
        <setHeader headerName="CamelFileName">
            <simple>${exchangeId}</simple>
        </setHeader>
        <unmarshal ref="bindyDataformat">
            <bindy type="Csv" classType="target.bindy.RealEstate" />
        </unmarshal>
        <split>
            <simple>body</simple>
            <to uri="jpa:target.bindy.RealEstate" />
        </split>
    </route>

</camelContext>

Please help.
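For reference, here is a minimal sketch of what the target.bindy.RealEstate class looks like in spirit (the field names below are placeholders, not my real columns): it serves as both the Bindy CSV mapping and the JPA entity, with the @Id generated by Hibernate because the CSV rows have no natural key.

    package target.bindy;

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;

    import org.apache.camel.dataformat.bindy.annotation.CsvRecord;
    import org.apache.camel.dataformat.bindy.annotation.DataField;

    // Sketch: a Bindy-mapped CSV record that is also a JPA entity.
    // Field names are illustrative; the relevant part is the generated key,
    // used because the CSV rows carry no natural primary key.
    @Entity
    @CsvRecord(separator = ",")
    public class RealEstate {

        // Surrogate key assigned by Hibernate at insert time, so it cannot
        // be used to detect whether a row was already persisted.
        @Id
        @GeneratedValue(strategy = GenerationType.AUTO)
        private Long id;

        @DataField(pos = 1)
        private String street;

        @DataField(pos = 2)
        private String city;

        @DataField(pos = 3)
        private double price;

        // getters and setters omitted for brevity
    }

Since the id is only assigned when a record is inserted, every object that reaches the jpa: endpoint is persisted as a new row.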