On Fri, Feb 22, 2013 at 5:35 PM, Claus Ibsen <claus.ib...@gmail.com> wrote:
> Hi
>
> Have you seen the splitter with the "group N lines together" section at
> http://camel.apache.org/splitter.html
>
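For reference, the documented "group N lines" feature mentioned above can be sketched as a minimal route (Camel 2.10+ only; endpoint URIs are placeholders, not from the original thread):

```java
import org.apache.camel.builder.RouteBuilder;

// Sketch only, assuming Camel 2.10+: tokenize("\n", 500) makes the
// streaming splitter emit one exchange per group of 500 lines instead
// of one exchange per line. "file:inbox" and the queue name are
// illustrative placeholders.
public class GroupedSplitRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("file:inbox")
            .split().tokenize("\n", 500).streaming()
            .to("activemq:queue:lines");
    }
}
```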
Ah yeah, you use an older Camel release. You can implement a custom
expression that does what this functionality in Camel 2.10 offers. You can
peek at the Camel source code to see how to do that. Basically, just create
a class with a method that returns a java.util.Iterator and returns the
data in batches of 500 lines. The Camel splitter in streaming mode will
then use that to walk the file.

> On Thu, Feb 21, 2013 at 10:10 PM, cristisor <cristisor...@yahoo.com> wrote:
>> Hello everybody,
>>
>> I'm using Fuse ESB with Apache Camel 2.4.0 (I think) to process some
>> large files. Until now, a service unit deployed in ServiceMix would read
>> the file line by line, then create and send an exchange containing that
>> line to another service unit, which would analyze the line and transform
>> it into an XML according to some parameters, then send the new exchange
>> to a service unit that maps that XML to another XML format and sends the
>> resulting exchange to a final service unit, which unmarshals the XML and
>> inserts the object into a database. I arrived on the project recently;
>> the architecture and the design are not mine, and I have to fix some
>> serious performance problems. I suspect that reading the files line by
>> line is slowing the processing down considerably, so I inserted an
>> AggregationStrategy to aggregate 100 - 200 lines at once. Here I ran
>> into trouble:
>> - if I send an exchange with more than 1 line, I have to make a lot of
>> changes to the XML-to-XML mappers, choice processors, etc.
>> - even if I solve the first problem, reading 500 lines at once and
>> creating one big XML from the data leads to an OOME, so I would have to
>> read at most 50 lines to make sure that no exceptions arise
>>
>> What I'm looking for is a way to read 500 - 1000 lines at once but send
>> each one in a separate exchange to the service unit that creates the
>> initial XML.
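The batching iterator Claus describes could look roughly like this in plain Java (a sketch, not code from the thread; class and variable names are illustrative). In Camel you would return such an iterator from a custom Expression so the streaming splitter walks the file one batch at a time:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch: wraps a BufferedReader and returns the input in batches of
// groupSize lines per next() call, so a streaming splitter sees one
// multi-line String per iteration instead of one line.
class GroupedLineIterator implements Iterator<String> {
    private final BufferedReader reader;
    private final int groupSize;
    private String nextBatch;

    GroupedLineIterator(BufferedReader reader, int groupSize) {
        this.reader = reader;
        this.groupSize = groupSize;
        this.nextBatch = readBatch();
    }

    // Reads up to groupSize lines and joins them with '\n';
    // returns null when the reader is exhausted.
    private String readBatch() {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < groupSize; i++) {
                String line = reader.readLine();
                if (line == null) {
                    break;
                }
                if (sb.length() > 0) {
                    sb.append('\n');
                }
                sb.append(line);
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }

    @Override
    public boolean hasNext() {
        return nextBatch != null;
    }

    @Override
    public String next() {
        if (nextBatch == null) {
            throw new NoSuchElementException();
        }
        String current = nextBatch;
        nextBatch = readBatch();
        return current;
    }
}
```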
>> My route looks similar to this one now:
>>
>> from("file://myfile.txt")
>>     .marshal().string("UTF-8")
>>     .split(body().tokenize("\n")).streaming()
>>     .setHeader("foo", constant("foo"))
>>     .aggregate(header("foo"), new StringBodyAggregator())
>>         .completionSize(50)
>>     .process(processor)
>>     .to("activemq queue");
>>
>> I read something about a producer template, but I'm not sure if it can
>> help me. Basically, I want to insert a mechanism that sends more than
>> one exchange, one for each line read, to the processor and then to the
>> endpoint. This way I read from the file in batches of hundreds or
>> thousands, but keep using the old mechanism for mapping, one line at a
>> time.
>>
>> Thank you.
>>
>> --
>> View this message in context:
>> http://camel.465427.n5.nabble.com/Large-file-processing-with-Apache-Camel-tp5727977.html
>> Sent from the Camel - Users mailing list archive at Nabble.com.
>
> --
> Claus Ibsen
> -----------------
> Red Hat, Inc.
> FuseSource is now part of Red Hat
> Email: cib...@redhat.com
> Web: http://fusesource.com
> Twitter: davsclaus
> Blog: http://davsclaus.com
> Author of Camel in Action: http://www.manning.com/ibsen

--
Claus Ibsen
-----------------
Red Hat, Inc.
FuseSource is now part of Red Hat
Email: cib...@redhat.com
Web: http://fusesource.com
Twitter: davsclaus
Blog: http://davsclaus.com
Author of Camel in Action: http://www.manning.com/ibsen