Re: [IDEA] great aggregation without a database, or performing reliable map/reduce using Message Groups...

Hiram Chirino Mon, 14 Jan 2008 07:45:46 -0800

I like it :)

On Jan 13, 2008 11:24 AM, Guillaume Nodet <[EMAIL PROTECTED]> wrote:
> That would be really awesome :-)
>
> Aggregating messages in a reliable and efficient way is always a pain
> and having such a feature would handle most of the use cases imho.
>
> For long running transactions, we might need the 'dispatch when the
> message group is complete' feature.  Even more complex would be to
> maybe use some kind of selector so that the broker knows when the
> message group is complete without relying on the number of messages.
> For example, you could use a time based expiration so that the broker
> sends all the messages when the expiration date has been reached or
> even combine predicates using the selector syntax.
> But I guess this is really more advanced stuff that we don't really need
> right now ... ;-)
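
A rough sketch of the "combine predicates" idea above, in plain Python rather than JMS selector syntax. Everything here is hypothetical: the predicate helpers, the "timestamp" field, and the idea that the broker evaluates a predicate per buffered group are assumptions for illustration, not an existing ActiveMQ feature.

```python
from typing import Callable, List

# A predicate sees the buffered messages of one group plus the current time.
Predicate = Callable[[List[dict], float], bool]


def count_reached(expected: int) -> Predicate:
    """Complete once the group holds at least `expected` messages."""
    return lambda msgs, now: len(msgs) >= expected


def older_than(seconds: float) -> Predicate:
    """Complete once the earliest buffered message is `seconds` old.

    "timestamp" is a made-up field used only for this sketch.
    """
    return lambda msgs, now: (
        bool(msgs) and now - min(m["timestamp"] for m in msgs) >= seconds
    )


def any_of(*predicates: Predicate) -> Predicate:
    """Combine predicates: the group is complete if any of them holds."""
    return lambda msgs, now: any(p(msgs, now) for p in predicates)


complete = any_of(count_reached(3), older_than(60.0))
group = [{"timestamp": 100.0}, {"timestamp": 110.0}]
early = complete(group, 120.0)  # neither 3 messages nor 60 seconds old
late = complete(group, 160.0)   # the oldest message is now 60 seconds old
```

Passing the clock in as an argument keeps the predicates easy to test; a real broker would presumably use its own notion of time.
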
>
>
> On Jan 13, 2008 5:12 PM, James Strachan <[EMAIL PROTECTED]> wrote:
>
> > ... I just wanted to explain an awesome idea that popped out of a bit
> > of brainstorming while sat at breakfast with Guillaume today.
> > Aggregating messages reliably, over long periods of time in a high
> > performance way is kinda sucky right now. Either you use batches like
> > the default Camel Aggregator
> > http://activemq.apache.org/camel/aggregator.html
> >
> > (which is a bit sucky) or you have to use a database (and then hit 2
> > phase commit type issues).
> >
> > The problem basically is that you want a consumer to process an entire
> > group of messages, in order, in a single transaction to avoid having
> > to use persistence; if it fails, the entire group of messages needs to
> > be redelivered (maybe to another consumer) in full - otherwise you
> > have to use persistence to maintain partial state.
> >
> > Also you only want a single consumer to get a single group of messages
> > at once as the consumer can only commit a single group of messages at
> > a time (it can't interleave them).
> >
> > So it's kinda like the consumer wants to do a selector that kinda says
> > 'send me one complete message group including the last closing message
> > of the sequence, in order please'.
> >
> > At first we pondered about selectors - then we had the idea of having
> > a kind of 'exclusive message group'; namely that using Message
> > Groups...
> > http://activemq.apache.org/message-groups.html
> >
> > we could maybe limit the consumer to only support one message group at
> > once until it's closed, then it can consume another message group. This
> > in itself is a pretty good solution that could work today fairly
> > easily (we just need the chooser code associating a new message group
> > to a consumer to ignore consumers that already have a message group
> > associated with them if they have 'exclusive message group' enabled).
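
The chooser change described above could look roughly like the following toy model. It is a sketch under assumptions: the class and method names are invented for illustration and do not correspond to ActiveMQ's actual broker code, and every consumer is assumed to have 'exclusive message group' enabled.

```python
from typing import Dict, List, Optional


class ExclusiveGroupChooser:
    """Toy model of a broker assigning message groups to consumers."""

    def __init__(self, consumers: List[str]) -> None:
        self.consumers = list(consumers)
        self.owner_of: Dict[str, str] = {}  # group id -> owning consumer

    def choose(self, group_id: str) -> Optional[str]:
        """Return the consumer that owns group_id, assigning one if needed."""
        if group_id in self.owner_of:
            return self.owner_of[group_id]
        busy = set(self.owner_of.values())
        for consumer in self.consumers:
            if consumer not in busy:  # ignore consumers that already hold a group
                self.owner_of[group_id] = consumer
                return consumer
        return None  # every consumer is busy, so the new group has to wait

    def close(self, group_id: str) -> None:
        """Release the owner once the group's closing message is consumed."""
        self.owner_of.pop(group_id, None)


chooser = ExclusiveGroupChooser(["c1", "c2"])
first = chooser.choose("order-1")
second = chooser.choose("order-2")
third = chooser.choose("order-3")  # both consumers already hold a group
chooser.close("order-1")
retry = chooser.choose("order-3")  # c1 is free again
```
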
> >
> > The main downside with this approach is that a single consumer can
> > then get locked by a single message group, that could span hours, days
> > or weeks - unable to process any more messages until finally the
> > message group is completed.
> >
> > So what would be really nice is if we supported the
> > JMSXGroupSeq header (for sequence numbers within a message group) and
> > made it possible to not dispatch any messages within a message group,
> > until the sequence is complete; so they'd stay around in the broker
> > until the sequence can be processed, in one brief amount of time by a
> > single consumer. Also we should support reordering of messages within
> > the sequence as well if the messages get out of order. Then folks
> > could do long term aggregation using purely ActiveMQ in a nice high
> > performance way!
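
To make 'dispatch when the message group is complete' concrete, here is a minimal in-memory sketch. Assumptions, none of which come from the proposal itself: sequence numbers start at 1, the closing message carries a hypothetical "closes_group" flag, and messages are plain dicts rather than JMS messages.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class GroupBuffer:
    """Holds messages per group and releases a group only when it is whole."""

    def __init__(self) -> None:
        self.pending: Dict[str, Dict[int, dict]] = defaultdict(dict)
        self.last_seq: Dict[str, int] = {}  # group id -> closing sequence number

    def add(self, msg: dict) -> Optional[List[dict]]:
        """Buffer msg; return the full group in sequence order once complete."""
        group, seq = msg["JMSXGroupID"], msg["JMSXGroupSeq"]
        self.pending[group][seq] = msg
        if msg.get("closes_group"):  # hypothetical end-of-sequence marker
            self.last_seq[group] = seq
        last = self.last_seq.get(group)
        if last is None:
            return None  # closing message not seen yet
        if not all(s in self.pending[group] for s in range(1, last + 1)):
            return None  # still missing part of the sequence
        batch = self.pending.pop(group)
        del self.last_seq[group]
        return [batch[s] for s in range(1, last + 1)]  # reordered by sequence


buf = GroupBuffer()
r1 = buf.add({"JMSXGroupID": "g", "JMSXGroupSeq": 3, "closes_group": True, "body": "c"})
r2 = buf.add({"JMSXGroupID": "g", "JMSXGroupSeq": 1, "body": "a"})
r3 = buf.add({"JMSXGroupID": "g", "JMSXGroupSeq": 2, "body": "b"})
bodies = [m["body"] for m in r3]
```

Here the closing message arrives first, nothing is released until the gap is filled, and the consumer would then see the group in sequence order.
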
> >
> > Another added benefit would be folks could do a totally asynchronous,
> > loosely coupled and reliable Map/Reduce pattern using purely ActiveMQ.
> > e.g. we can split a single message using the splitter
> > http://activemq.apache.org/camel/splitter.html
> >
> > using the message ID as the JMSXGroupID, and for each child message we
> > assign a JMSXGroupSeq, closing the sequence with the last message.
> > Then each message can be processed by any consumer in a grid, sending
> > replies back to another queue using the same JMSXGroupID and
> > JMSXGroupSeq. Then when all the messages are received and the sequence
> > is complete, the entire sequence of response messages is sent to a
> > single consumer; who then commits its transaction when the last
> > message in the sequence is processed. All without any explicit
> > persistence - yet the whole thing would be totally transactional and
> > reliable! Not bad eh?
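
The Map/Reduce flow above, sketched end to end without a broker. This only illustrates the header bookkeeping: the "closes_group" flag, the word-length "work", and the function names are assumptions, and queues, transactions and redelivery are left out entirely.

```python
import random
from typing import List


def split(message_id: str, parts: List[str]) -> List[dict]:
    """Split one message into children that share the parent's ID as group ID."""
    return [
        {
            "JMSXGroupID": message_id,
            "JMSXGroupSeq": seq,
            "closes_group": seq == len(parts),  # hypothetical closing marker
            "body": part,
        }
        for seq, part in enumerate(parts, start=1)
    ]


def work(msg: dict) -> dict:
    """Map step: any worker may run this; the reply keeps group ID and sequence."""
    reply = dict(msg)
    reply["body"] = len(msg["body"])
    return reply


def reduce_group(replies: List[dict]) -> int:
    """Reduce step: runs once over the complete, reordered reply sequence."""
    ordered = sorted(replies, key=lambda m: m["JMSXGroupSeq"])
    last = next(m["JMSXGroupSeq"] for m in ordered if m["closes_group"])
    if [m["JMSXGroupSeq"] for m in ordered] != list(range(1, last + 1)):
        raise ValueError("reply sequence is incomplete")
    return sum(m["body"] for m in ordered)


children = split("ID:msg-42", ["alpha", "be", "gam"])
replies = [work(m) for m in children]
random.shuffle(replies)  # replies may come back in any order
total = reduce_group(replies)
```
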
> >
> > As a first stab, it'd be nice to support the 'exclusive message group'
> > idea which seems pretty easy to do; as then at least we'd have an
> > awesome solution for Map/Reduce scenarios which complete within a
> > relatively short amount of time; we'd only need the more advanced
> > 'dispatch when the message group is complete' option for dealing with
> > very long running Map/Reduce problems - which are probably fairly
> > rare.
> >
> > Thoughts?
> > --
> > James
> > -------
> > http://macstrac.blogspot.com/
> >
> > Open Source Integration
> > http://open.iona.com
> >
>
>
>
> --
> Cheers,
> Guillaume Nodet
> ------------------------
> Blog: http://gnodet.blogspot.com/
>




-- 
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com
