OK. You might be able to avoid the StringBuilder. Since you already have a
ByteArrayOutputStream, you can get the exchange body as a byte array, an input
stream, an output stream … (http://camel.apache.org/type-converter.html) and
write/append the bytes directly, copy the byte arrays together …
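
A minimal sketch of appending the bytes directly, assuming each mapper call
hands back its own ByteArrayOutputStream. The class and method names
(MappedOutputCombiner, combine) are made up for illustration and are not part
of Camel or MapForce:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Camel or MapForce class.
public class MappedOutputCombiner {

    // Copies the bytes of every mapper output into one target stream.
    // No intermediate String is created per record, unlike
    // StringBuilder.append(output.toString()).
    public static byte[] combine(List<ByteArrayOutputStream> mappedOutputs) throws IOException {
        ByteArrayOutputStream combined = new ByteArrayOutputStream();
        for (ByteArrayOutputStream mapped : mappedOutputs) {
            mapped.writeTo(combined); // raw byte copy into the target stream
        }
        return combined.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Two stand-in mapper outputs; real ones would come from the mapper.
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        byte[] a = "<Quantity>200</Quantity>".getBytes(StandardCharsets.UTF_8);
        first.write(a, 0, a.length);

        ByteArrayOutputStream second = new ByteArrayOutputStream();
        byte[] b = "<Quantity>300</Quantity>".getBytes(StandardCharsets.UTF_8);
        second.write(b, 0, b.length);

        byte[] body = combine(Arrays.asList(first, second));
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

The resulting byte array could then be set as the message body in place of the
String. Note that this only avoids the extra String copies; the whole batch is
still held in memory at once.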

In addition, if you have multiple input lines, can't MapForce transform all
lines of your exchange at once, so that its output already contains multiple
XML records?

The numbers in the sample improved by WRITING multiple lines at once using an
Aggregator instead of writing line by line into the file. So the IO issue
there was not READING the original file. That's why I'm not sure you have an IO
issue at all; you should first verify that reading the file is actually slow.
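
One rough way to check the read side in isolation is to time a plain pass over
the file with nothing else attached. This is only a sketch: the class name
ReadTiming is made up, UTF-8 is an assumed encoding, and a single run is not a
proper benchmark (the OS file cache alone can change the result):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for a quick measurement, not part of Camel.
public class ReadTiming {

    // Reads the file line by line, discards the content and returns
    // the number of lines read.
    public static long countLines(Path file) throws IOException {
        long lines = 0;
        // UTF-8 is an assumption; use the real encoding of the input file.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // path to the input file
        long start = System.nanoTime();
        long lines = countLines(file);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(lines + " lines read in " + elapsedMs + " ms");
    }
}
```

If this pass is fast compared to the end-to-end time per line, the time is
going somewhere other than reading the file.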

Maruan Sahyoun

On 22.02.2013 at 09:05, cristisor <cristisor...@yahoo.com> wrote:

> I will try to provide the steps that are in the current version:
> 1. read one line from the file, set it as the outbound message's body of an
> exchange, and, according to the file type, send the exchange to an activemq
> queue
> 2. the exchange will arrive on another service unit that has a processor
> which creates an input stream from that line and sends it to an xml mapper
> generated using Altova MapForce 2011 (as I mentioned before, I didn't choose
> the mapper but they say it's extremely fast). This mapper returns a
> ByteArrayOutputStream output containing an xml string that represents the
> mapping of some values from the read line to actual xml fields. As a basic
> example, number 200 from the line will be mapped to
> <InitialAmount>200</InitialAmount>. The xml gets set as the outbound
> message's body of the exchange and the exchange is being sent to another
> queue
> 3. when the exchange with the xml is received it gets sent to another
> processor, with another generated mapper, that maps this xml to another xml,
> for example <InitialAmount>200</InitialAmount> to <Quantity>200</Quantity>.
> This is just a simple example but the mapping can be more complex. The final
> xml string is set as the outbound message's body of the exchange and the
> exchange is being sent to the final service unit.
> 4. the final service unit picks up the exchange, unmarshals its body into
> an actual db value object and inserts that object into the db
> 
> When I get the OOME I actually append each ByteArrayOutputStream output's
> toString() to a StringBuilder. I have to do this because I get 500 lines
> from the file, I map each of them into an xml in a while loop and I have no
> idea how to send each xml into an exchange so I append everything and set
> the final result to the exchange's outbound message body. If I could send
> each xml after I map it, instead of appending it, and map another one inside
> the same process method it would be perfect, it would be the answer to my
> problem.
> 
> I want to implement batch inserts/updates on the db to increase the
> performance and I also want to read hundreds of lines from the text file but
> at a certain point send the mapped xml in exchanges one by one, not all of
> them at the same time. I think that I/O operations take a lot of time, just
> like in "Parsing large Files with Apache Camel" from catify.com where he
> raised the number of read lines per second from 200 to 4000 by reading in
> batches instead of per line.
> 
> 
> 
> --
> View this message in context: 
> http://camel.465427.n5.nabble.com/Large-file-processing-with-Apache-Camel-tp5727977p5728001.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
