[ https://issues.apache.org/jira/browse/ARTEMIS-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17723532#comment-17723532 ]
Justin Bertram commented on ARTEMIS-4276:
-----------------------------------------

bq. We are using virtual topics for that.

Now that you're on ActiveMQ Artemis you can use JMS 2's [shared topic consumer|https://docs.oracle.com/javaee/7/api/javax/jms/Session.html#createSharedConsumer-javax.jms.Topic-java.lang.String-].

bq. By using grouping we ensure that the same consumer will process all versions of the same transaction.

As noted previously, grouping *doesn't* ensure that the *same* consumer will process all the messages in the group. It only guarantees that _one consumer at a time_ will process the messages in the group and therefore the messages will be processed in order.

bq. To handle a message duplication all our consumer's listeners are using a LRU (last recently used) cache of the already processed messages.

A local, volatile LRU cache is not enough to mitigate duplicate messages. Keep in mind that even _if_ the broker maintained the consumer-group relationship during broker failover the consumer itself can still fail at any point (e.g. JVM crash, hardware failure, network glitch, etc.) at which time a new consumer for the group will be chosen which may lead to processing duplicate messages since the _new_ consumer won't have the already-processed messages in its LRU cache. In short, guaranteeing that the same consumer gets the same group on broker failover does not adequately deal with the threat of duplicate messages.

Generally speaking, distributing state like this (i.e. in the consumer's LRU cache) is not a good idea because it typically leads to consistency issues. State should be concentrated in the non-distributed components (i.e. message broker & database).

bq. Is the grouping cached used by the broker distributed or persisted during te failover switch?

No. The consumer-group relationship is not designed to survive fail-over for the reasons I outlined previously.

bq. Is there any setup to circumvent this?

Yes.
Simply put, your consumers need to be [_idempotent_|https://en.wikipedia.org/wiki/Idempotence]. In your situation I can think of a few ways to do this.

Often when folks need to keep data between two resources like a message broker and a database in sync they use an [XA transaction|https://en.wikipedia.org/wiki/X/Open_XA]. In Java this is implemented via [JTA|https://github.com/jakartaee/transactions]. This is very common in Java, especially when an application is running in a Java EE application server, because MDBs are transactional by default and any other XA resource used in the course of processing a JMS message in an MDB is automatically enlisted into the transaction, meaning that all the work is done _atomically_ (i.e. either it all succeeds or it all fails). By using a JTA transaction between the JMS and JDBC resources you ensure that if the JDBC insert succeeds but the JMS message acknowledgement fails then everything will be rolled back so that neither the JMS message is consumed nor the data is actually inserted into the JDBC database. When the message is consumed again later there will be no duplicate entries in the database.

Another way to deal with this would be to set up a primary key on the table (or tables) where you're inserting data. This would prevent duplicate records from being inserted into the database when consumers receive duplicate messages. The primary key could be a combination of the {{JMSXGroupID}} and the version (e.g. {{EXT_BOND_ID_4}}). Therefore, in the scenario you outlined in your comment, when *LDR1* receives *EXT_BOND_ID* with version *4* it will process it, and when it tries to insert it into the database it won't actually be able to because a row with that primary key already exists.
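To make the primary-key idea concrete, here is a minimal sketch of an insert that succeeds only once per group ID and version. Everything in it is hypothetical: the class name {{IdempotentLoaderSketch}}, the method names, and the payload strings are made up for illustration, and the in-memory map is only a stand-in for the database table so the example runs on its own. In a real loader the uniqueness check would have to be the database's primary key constraint, since in-memory state is lost when the consumer fails.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentLoaderSketch {
    // Stand-in for a database table keyed by (group ID, version).
    // Assumption for illustration only: a real loader must rely on the
    // database's primary key constraint, not on consumer-local memory.
    private final Map<String, String> table = new ConcurrentHashMap<>();

    // Combines the JMSXGroupID value and the version into one key,
    // e.g. "EXT_BOND_ID" and 4 become "EXT_BOND_ID_4".
    public static String primaryKey(String groupId, int version) {
        return groupId + "_" + version;
    }

    // Returns true if the row was inserted, false if a row with the same
    // key already existed (a duplicate message that can be acknowledged
    // without doing the work again).
    public boolean insertOnce(String groupId, int version, String payload) {
        return table.putIfAbsent(primaryKey(groupId, version), payload) == null;
    }

    public static void main(String[] args) {
        IdempotentLoaderSketch loader = new IdempotentLoaderSketch();
        System.out.println(loader.insertOnce("EXT_BOND_ID", 4, "first delivery"));
        System.out.println(loader.insertOnce("EXT_BOND_ID", 4, "redelivery"));
        System.out.println(loader.insertOnce("EXT_BOND_ID", 5, "next version"));
    }
}
```

The second call is rejected because the key {{EXT_BOND_ID_4}} is already present, which mirrors what the database would do on a duplicate insert regardless of which consumer received the redelivered message.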
> Message Group does not replicate properly during failover
> ---------------------------------------------------------
>
> Key: ARTEMIS-4276
> URL: https://issues.apache.org/jira/browse/ARTEMIS-4276
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Affects Versions: 2.28.0
> Reporter: Liviu Citu
> Priority: Major
>
> Hi,
> We are currently migrating our software from Classic to Artemis and we plan
> to use failover functionality.
> We were using message group functionality by setting *JMSXGroupID* and this
> was working as expected. However after failover switch I noticed that
> messages are sent to wrong consumers.
> Our gateway/interface application is actually a collection of servers:
> * gateway adapter server: receives messages from an external systems and
> puts them on a specific/virtual topic
> * gateway loader server (can be balanced): picks up the messages from the
> topic and do processing
> * gateway fail queue: monitors all messages that failed processing and has a
> functionality of resubmitting the message (users will correct the processing
> errors and then resubmit transaction)
> *JMSXGroupID* is used to ensure that during message resubmit the same
> consumer/loader is processing the message as it was originally processed.
> However, if the message resubmit is happening during failover switch we have
> noticed that the message is not sent to the right consumer as it should.
> Basically the first available consumer is used which is not what we want.
> I have searched for configuration changes but couldn't find any relevant
> information.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)