Re: Improving the Aggregator

Claus Ibsen Sun, 16 Dec 2012 22:14:03 -0800

On Fri, Dec 7, 2012 at 9:25 AM, Claus Ibsen <claus.ib...@gmail.com> wrote:
> On Thu, Dec 6, 2012 at 11:56 PM, Christian Müller
> <christian.muel...@gmail.com> wrote:
>> It would be great to improve the Aggregator to support a completion size
>> which evaluates ALL exchanges (from all different aggregates).
>> We often have the requirement to split a file which contains items which we
>> process independently. At the end, we have to aggregate the items based on
>> the customer.
>>
>> As an example, we have one input file with 1000 items which will result in
>> 2 output files with 300 and 700 items (or 3 output files with 273, 493, 234
>> items - or 4 output files with ... I think you got it). At the end, the
>> aggregator will have received 1000 items in total, which should trigger a
>> flush on ALL existing aggregates.
>>
>
>> What do you think?
>
>
>> Is there already a simple solution for this which I miss?
>>
>
> Yes.
>
> The EIPs are composable and you can build solutions like lego bricks.
>
> So what you should look at is the composed message processor
> http://camel.apache.org/composed-message-processor.html
> ... to do the rough split + aggregate, into eg 2-4 files.
>
> And then afterwards you can split.
>


Christian, did you have a chance to look at this EIP?
I was referring to using only the splitter, as it has built-in aggregation.
As you only have 1000 lines in the file, you can keep it in memory,
and the 1st splitter
will just do the coarse-grained splitting into 2, 3 or 4 "files".
You can do this in the AggregationStrategy
by storing a Map keyed by the filename / correlation id.


A pseudo route can be something like:

from file:inbox
  split myCoarseGrainedAggregator
    process myFileNameCorrelation
  end
  split body
    to file:outbox
  end

In the 1st splitter we use an aggregation strategy to group the lines into the
2, 3 or 4 files and set that data on the outgoing exchange.
As the splitter expects at least 1 child, we use a processor to
compute a header with the filename for the given split line.

In the 2nd splitter we split those 2, 3 or 4 files and store each file, or
do whatever else you want with them.
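
The Map-based grouping that the aggregation strategy in the 1st splitter
would do can be sketched in plain Java, without any Camel API. This is only
an illustration: the class CoarseGrouping, the method groupByKey and the
"customer,item" line format are assumptions of mine, not something from the
thread or from Camel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Camel: groups lines by a correlation key.
class CoarseGrouping {

    // Returns one entry per output "file". The key is the correlation id
    // (e.g. the customer or target filename); the value holds that file's
    // lines in arrival order.
    static Map<String, List<String>> groupByKey(List<String> lines,
                                                Function<String, String> keyOf) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        for (String line : lines) {
            files.computeIfAbsent(keyOf.apply(line), k -> new ArrayList<>()).add(line);
        }
        return files;
    }

    public static void main(String[] args) {
        // Assumed line format "customer,item"; the real layout is not given.
        List<String> lines = Arrays.asList("acme,1", "globex,2", "acme,3");
        Map<String, List<String>> files = groupByKey(lines, l -> l.split(",")[0]);
        // All groups are complete once the last line has been seen, so every
        // "file" can be handed to the 2nd splitter together.
        files.forEach((name, content) -> System.out.println(name + " -> " + content));
    }
}
```

In a real route the same Map would live on the exchange that the
AggregationStrategy returns, and the 2nd splitter would iterate over its
entries.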





>
>
>> At present we use a combination of two aggregators to do this. Each will
>> receive a copy of an item. The second aggregator only counts the total
>> number of all received exchanges (we are using the same aggregation key
>> value for this aggregator). If this aggregator flushes the exchange, we send
>> a "command message" to the first aggregator to flush all the aggregates.
>>
>> I think there should be a simpler solution out of the box which Camel
>> should offer. E.g. completionRepositorySize or repositoryCompletionSize
>>
>> Best,
>> Christian
>
>
>
> --
> Claus Ibsen
> -----------------
> Red Hat, Inc.
> FuseSource is now part of Red Hat
> Email: cib...@redhat.com
> Web: http://fusesource.com
> Twitter: davsclaus
> Blog: http://davsclaus.com
> Author of Camel in Action: http://www.manning.com/ibsen



