... I just wanted to explain an awesome idea that popped out of a bit of brainstorming while sat at breakfast with Guillaume today.

Aggregating messages reliably, over long periods of time, in a high performance way is kinda sucky right now. Either you use batches like the default Camel Aggregator
http://activemq.apache.org/camel/aggregator.html
(which is a bit sucky) or you have to use a database (and then hit 2 phase commit type issues).

The problem basically is that you want a consumer to process an entire group of messages, in order, in a single transaction to avoid having to use persistence; if the consumer fails, the entire group of messages needs to be redelivered (maybe to another consumer) in full - otherwise you have to use persistence to maintain partial state. Also you only want a single consumer to get a single group of messages at once, as the consumer can only commit a single group of messages at a time (it can't interleave them).

So it's kinda like the consumer wants to do a selector that says 'send me one complete message group including the last closing message of the sequence, in order please'.

At first we pondered about selectors - then we had the idea of having a kind of 'exclusive message group'; namely that using Message Groups
http://activemq.apache.org/message-groups.html
we could maybe limit the consumer to only support one message group at once until it's closed, then it can consume another message group. This in itself is a pretty good solution that could work today fairly easily (we just need the chooser code that associates a new message group with a consumer to ignore consumers that already have a message group associated with them, if they have 'exclusive message group' enabled).

The main downside with this approach is that a single consumer can then get locked by a single message group that could span hours, days or weeks - unable to process any more messages until the message group is finally completed.

So what would be really nice is if we supported the JMSXGroupSeq header (for sequence numbers within a message group) and made it possible to not dispatch any messages within a message group until the sequence is complete; so they'd stay around in the broker until the whole sequence can be processed, in one brief amount of time, by a single consumer.
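
To make the 'don't dispatch until the sequence is complete' idea a bit more concrete, here is a rough sketch in plain Java. It is not ActiveMQ code and uses no JMS API: `GroupCompletionBuffer`, `Msg` and the `last` flag are all made-up names for illustration. It assumes sequence numbers start at 1 and that the closing message of a group is marked somehow. It also only holds things in memory, so it says nothing about how the broker would store the pending messages or tie the release into a transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Holds messages per group and releases a group only once its sequence is complete. */
final class GroupCompletionBuffer {

    /** Hypothetical stand-in for a JMS message; not a real ActiveMQ or JMS type. */
    static final class Msg {
        final String groupId; // would correspond to JMSXGroupID
        final int seq;        // would correspond to JMSXGroupSeq, assumed to start at 1
        final boolean last;   // assumed marker for the closing message of the group
        final String body;

        Msg(String groupId, int seq, boolean last, String body) {
            this.groupId = groupId;
            this.seq = seq;
            this.last = last;
            this.body = body;
        }
    }

    /** Messages seen so far for one group, kept sorted by sequence number. */
    private static final class Pending {
        final TreeMap<Integer, Msg> bySeq = new TreeMap<>();
        int lastSeq = -1; // unknown until the closing message arrives
    }

    private final Map<String, Pending> pending = new HashMap<>();

    /**
     * Buffers one message. Returns the whole group, ordered by sequence number,
     * if this message completed it; otherwise returns an empty Optional.
     */
    Optional<List<Msg>> offer(Msg m) {
        Pending p = pending.computeIfAbsent(m.groupId, k -> new Pending());
        p.bySeq.put(m.seq, m); // a redelivered duplicate simply overwrites
        if (m.last) {
            p.lastSeq = m.seq;
        }
        // Complete when the closing message has been seen and 1..lastSeq are all present.
        boolean complete = p.lastSeq > 0
                && p.bySeq.size() == p.lastSeq
                && p.bySeq.firstKey() == 1
                && p.bySeq.lastKey() == p.lastSeq;
        if (!complete) {
            return Optional.empty();
        }
        pending.remove(m.groupId);
        return Optional.of(new ArrayList<>(p.bySeq.values()));
    }

    public static void main(String[] args) {
        GroupCompletionBuffer buffer = new GroupCompletionBuffer();
        buffer.offer(new Msg("order-1", 2, false, "b"));
        buffer.offer(new Msg("order-1", 3, true, "c"));
        // The group is only handed over once the missing first message turns up.
        buffer.offer(new Msg("order-1", 1, false, "a"))
              .ifPresent(group -> group.forEach(msg -> System.out.println(msg.seq + " " + msg.body)));
    }
}
```

Because the pending messages are keyed by sequence number, the group comes out in order even if the messages arrived out of order. A consumer handed the returned list could process it and commit in one go.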
Also we should support reordering of messages within the sequence if the messages get out of order. Then folks could do long term aggregation using purely ActiveMQ in a nice high performance way!

Another added benefit would be that folks could do a totally asynchronous, loosely coupled and reliable Map/Reduce pattern using purely ActiveMQ. e.g. we can split a single message using the splitter
http://activemq.apache.org/camel/splitter.html
using the message ID as the JMSXGroupID, and for each child message we assign a JMSXGroupSeq until, on the last message, we close the sequence. Then each message can be processed by any consumer in a grid, sending replies back to another queue using the same JMSXGroupID and JMSXGroupSeq. Then when all the messages are received and the sequence is complete, the entire sequence of response messages is sent to a single consumer, which then commits its transaction when the last message in the sequence is processed. All without any explicit persistence - yet the whole thing would be totally transactional and reliable! Not bad eh?

As a first stab, it'd be nice to support the 'exclusive message group' idea, which seems pretty easy to do; then at least we'd have an awesome solution for Map/Reduce scenarios which complete within a relatively short amount of time. We'd only need the more advanced 'dispatch when the message group is complete' option for dealing with very long running Map/Reduce problems - which are probably fairly rare.

Thoughts?

--
James
-------
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
