The default Message Group Map implementation was recently changed to use an
LRU Cache of message groups.

Here's the issue with message groups - the broker does not know the set of
message group IDs ahead of time and must allow for any number of group IDs
to be used.  If the total set of possible message group IDs is small, this
is not a major concern, but if it is large, then tracking message group
owners over time acts like a memory leak (consider the memory needed to
maintain a mapping of 1 million message groups).

The cache implementation attempts to address the concern by limiting the map
to retain only 1024 group ID mappings (by default - the number appears to be
configurable, from looking at the code).  This means that as soon as more
than 1024 group IDs are in use at once within the broker, the least recently
used assignments get dropped.  A dropped group is attached to a consumer
again as needed, possibly a different one, which in turn pushes out yet
another assignment, and so on.
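
To make the eviction behavior concrete, here is a minimal sketch of an LRU
map of group owners built on java.util.LinkedHashMap.  The class names
(LruGroupMap, LruGroupMapDemo) and the consumer IDs are made up for
illustration; this is not the broker's actual implementation, only the same
general technique.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of an LRU-bounded group-to-consumer map.
class LruGroupMap extends LinkedHashMap<String, String> {
    private final int maxGroups;

    LruGroupMap(int maxGroups) {
        // accessOrder = true: entries are ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxGroups = maxGroups;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // Invoked after each insertion; returning true drops the least recently used entry
        return size() > maxGroups;
    }
}

class LruGroupMapDemo {
    public static void main(String[] args) {
        LruGroupMap owners = new LruGroupMap(1024);
        for (int i = 0; i < 1024; i++) {
            owners.put("group-" + i, "consumer-" + (i % 4));
        }
        // The 1025th concurrent group pushes out the least recently used one.
        owners.put("group-1024", "consumer-0");
        System.out.println(owners.containsKey("group-0")); // false: assignment lost
        System.out.println(owners.size());                 // 1024
    }
}
```

When the next message for group-0 arrives, nothing remembers its old owner,
so it can be handed to any consumer, and storing that new assignment evicts
some other group.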

On the other hand, the previous default implementation used a Hash Map in
which the hash value of each group ID determines a "bucket" to which the
group is assigned; any number of groups can land in the same bucket.  Each
bucket is assigned to a single consumer.  Like the LRU cache, the number of
buckets is limited, thereby eliminating the possibility of a "pseudo-leak".
However, this leads to the issue that assignments may not be fair: a single
consumer may be assigned any combination of groups, entirely based on the
hash of the group IDs.  If selectors are added to the mix, this easily leads
to messages being assigned to consumers whose selectors do not match them,
so those consumers cannot consume the messages.  Yuck.  Add in the max page
size limitation and messages start getting stuck all over the place - double
yuck.
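
A rough sketch of the bucket idea follows.  BucketGroupMap and its methods
are hypothetical names, and the real implementation certainly differs in
detail; the point is only that the owner is stored per bucket, not per group,
so memory is bounded by the bucket count but unrelated groups end up sharing
a consumer.

```java
// Hypothetical illustration of hash-bucket ownership; not ActiveMQ's actual class.
class BucketGroupMap {
    // One consumer ID (or null if unassigned) per bucket; the size never grows
    private final String[] bucketOwners;

    BucketGroupMap(int bucketCount) {
        bucketOwners = new String[bucketCount];
    }

    // Maps a group ID to a bucket index; floorMod keeps the index non-negative
    int bucketFor(String groupId) {
        return Math.floorMod(groupId.hashCode(), bucketOwners.length);
    }

    // Returns the consumer owning the group's bucket, claiming the bucket
    // for candidateConsumer if nobody owns it yet
    String ownerOf(String groupId, String candidateConsumer) {
        int bucket = bucketFor(groupId);
        if (bucketOwners[bucket] == null) {
            bucketOwners[bucket] = candidateConsumer;
        }
        return bucketOwners[bucket];
    }
}

class BucketGroupMapDemo {
    public static void main(String[] args) {
        BucketGroupMap map = new BucketGroupMap(4);
        // "A" and "E" hash to the same bucket when there are 4 buckets
        System.out.println(map.ownerOf("A", "consumer-1")); // consumer-1
        System.out.println(map.ownerOf("E", "consumer-2")); // consumer-1
        System.out.println(map.ownerOf("B", "consumer-2")); // consumer-2
    }
}
```

Group "E" goes to consumer-1 purely because of its hash, regardless of how
busy consumer-1 already is or whether its selector would match the message.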

The best practice in general is to look for ways to avoid depending on
delivery order (e.g. attaching sequence numbers to messages so that the
processor can determine when messages are received out-of-order and then
suspend processing until the late messages are received).  Camel's aggregator
and/or resequencer processors can help here.
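
As an illustration of the sequence number approach, here is a small
hand-rolled resequencer in plain Java.  It is not Camel's API; the
Resequencer class and its accept method are names invented for this sketch,
and it assumes sequence numbers are consecutive and message bodies are
non-null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: holds back out-of-order messages until the gap is filled.
class Resequencer {
    private final Map<Long, String> pending = new HashMap<>();
    private long nextExpected;

    Resequencer(long firstSequence) {
        this.nextExpected = firstSequence;
    }

    // Accepts one message and returns every message that is now ready, in order.
    List<String> accept(long sequence, String body) {
        List<String> ready = new ArrayList<>();
        if (sequence < nextExpected) {
            return ready; // already released (duplicate); ignored in this sketch
        }
        pending.put(sequence, body);
        // Release the longest consecutive run starting at nextExpected
        String next;
        while ((next = pending.remove(nextExpected)) != null) {
            ready.add(next);
            nextExpected++;
        }
        return ready;
    }
}

class ResequencerDemo {
    public static void main(String[] args) {
        Resequencer r = new Resequencer(1);
        System.out.println(r.accept(2, "b")); // []      (held until 1 arrives)
        System.out.println(r.accept(1, "a")); // [a, b]
        System.out.println(r.accept(3, "c")); // [c]
    }
}
```

A real processor would also need a timeout or size limit on the pending
buffer, since a message that never arrives would otherwise block everything
behind it.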

Using a key such as social security number for message groups is going to be
challenging simply due to the number of groups involved, and the memory leak
concern mentioned above.  If guarantees can be met, such as "no more than
1000 social security numbers will ever have pending messages at a time,"
then the concerns can be eliminated.  Otherwise, the hash map solution is
probably the best bet here - at the expense of reduced fairness of mappings
(one consumer can easily carry more than its share) and of ruling out
selectors (although I usually recommend against using selectors with message
groups anyway).




--
View this message in context: 
http://activemq.2283324.n4.nabble.com/Message-Group-Limitations-how-many-simulataneous-groups-are-supported-tp4706412p4706419.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
