Hi

Have you seen the "group N lines together" section of the Splitter
documentation? It is at
http://camel.apache.org/splitter.html
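
To make the idea concrete, here is a minimal plain-Java sketch of what grouping N lines amounts to: read the file lazily, collect up to N lines, then hand the lines on one at a time. This is not Camel's implementation. The names LineGrouper and forEachGroup are made up for illustration. The grouping option on that page may also be missing from a release as old as 2.4, so check it against the version you run.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only; not part of Camel.
public class LineGrouper {

    /**
     * Reads lines lazily and passes them to the handler in groups of at
     * most groupSize lines. Returns the number of groups handed over.
     */
    public static int forEachGroup(BufferedReader reader, int groupSize,
                                   Consumer<List<String>> handler) throws IOException {
        if (groupSize < 1) {
            throw new IllegalArgumentException("groupSize must be >= 1");
        }
        List<String> group = new ArrayList<>(groupSize);
        int groups = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            group.add(line);
            if (group.size() == groupSize) {
                handler.accept(group);
                groups++;
                // Start a fresh list so the handler may keep the old one.
                group = new ArrayList<>(groupSize);
            }
        }
        // Flush the last, possibly smaller, group.
        if (!group.isEmpty()) {
            handler.accept(group);
            groups++;
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc\nd\ne"));
        // Each group is read in one go, but every line is still handled
        // individually, as the existing per-line mappers expect.
        int groups = forEachGroup(reader, 2, group -> {
            for (String line : group) {
                System.out.println("line: " + line);
            }
        });
        System.out.println("groups: " + groups);
    }
}
```

Only one group is held in memory at a time, so the group size bounds memory use regardless of the file size.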


On Thu, Feb 21, 2013 at 10:10 PM, cristisor <cristisor...@yahoo.com> wrote:
> Hello everybody,
>
> I'm using Apache Fuse ESB with Apache Camel 2.4.0 (I think) to process some
> large files. Until now a service unit deployed in servicemix would read the
> file line by line, create and send an exchange containing that line to
> another service unit that would analyze the line and transform it into an
> xml according to some parameters, then send the new exchange to a new
> service unit that would map that xml to another xml format and send the new
> exchange containing the new xml to a final service unit that unmarshals the
> xml and inserts the object into a database. I arrived on the project, the
> architecture and the design are not mine, and I have to fix some serious
> performance problems. I suspect that reading the files line by line is
> slowing the processing very much, so I inserted an AggregationStrategy to
> aggregate 100 - 200 lines at once. Here I get into trouble:
> - if I send an exchange with more than 1 line I have to make a lot of
> changes on the xml to xml mappers, choice processors, etc
> - even if I solve the first problem, if I read 500 lines at once and
> create one big xml from the data I get an OutOfMemoryError, so I should
> read at most 50 lines to make sure that no errors arise
>
> What I'm looking for is a way to read 500 - 1000 lines at once but send each
> one in a different exchange to the service unit that creates the initial
> xml. My route looks similar to this one now:
>
> from("file://myfile.txt")
>         .marshal().string("UTF-8")
>         .split(body().tokenize("\n")).streaming()
>                 .setHeader("foo", constant("foo"))
>                 .aggregate(header("foo"), new StringBodyAggregator())
>                         .completionSize(50)
>                 .process(processor)
>                 .to("activemq queue");
>
> I read something about a template producer but I'm not sure if it can help
> me. Basically I want to insert a mechanism to send more than one exchange,
> one for each read line, to the processor and then to the endpoint. This way
> I read from the file in batches of hundreds or thousands and I keep using
> the old mechanism for mapping, one line at a time.
>
> Thank you.
>
>
>
> --
> View this message in context: 
> http://camel.465427.n5.nabble.com/Large-file-processing-with-Apache-Camel-tp5727977.html
> Sent from the Camel - Users mailing list archive at Nabble.com.



-- 
Claus Ibsen
-----------------
Red Hat, Inc.
FuseSource is now part of Red Hat
Email: cib...@redhat.com
Web: http://fusesource.com
Twitter: davsclaus
Blog: http://davsclaus.com
Author of Camel in Action: http://www.manning.com/ibsen
