[ https://issues.apache.org/jira/browse/ARTEMIS-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Howard Gao closed ARTEMIS-2854.
-------------------------------

> Non-durable subscribers may stop receiving after failover
> ---------------------------------------------------------
>
>                 Key: ARTEMIS-2854
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2854
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.14.0
>            Reporter: Howard Gao
>            Assignee: Howard Gao
>            Priority: Major
>             Fix For: 2.16.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In a cluster scenario where non-durable subscribers fail over to a backup
> while another live node is forwarding messages to it, there is a chance that
> the live node keeps the old remote binding for the subscribers. Messages
> routed to those old remote bindings will result in "binding not found".
>
> For example, suppose there are 2 live-backup pairs in the cluster: Live1 and
> backup1, Live2 and backup2. A non-durable subscriber connects to Live1, and
> messages are sent to Live2 and then redistributed to the subscriber on Live1.
>
> Now Live1 crashes and backup1 becomes live. The subscriber fails over to
> backup1.
>
> In the meantime, Live2 reconnects to backup1 too. During this process, Live2
> does not successfully remove the old remote binding for the subscriber, and
> the binding still points to the old temp queue's id (which is gone along with
> Live1, as it's a temp queue). So messages sent after failover are still
> routed to the old queue, which is no longer there. The subscriber sits idle
> without receiving new messages.
>
> The code concerned is this:
> https://github.com/apache/activemq-artemis/blob/master/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/cluster/impl/ClusterConnectionImpl.java#L1239
>
> The code doesn't handle the case where the old remote binding is still in
> the map and its key (clusterName) is the same as that of the new remote
> binding (which references a new temp queue) recreated on failover.
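The key collision described above can be illustrated with a minimal, self-contained sketch. This is not the Artemis code and not necessarily how the issue was fixed: the class StaleBindingSketch, its methods, and the queue ids 100 and 200 are all hypothetical. It only assumes, as the description states, that remote bindings are held in a map keyed by clusterName.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a remote-binding table; NOT the real ClusterConnectionImpl.
public class StaleBindingSketch {
    // clusterName -> id of the remote (temp) queue the binding points at
    private final Map<String, Long> bindings = new HashMap<>();

    // Naive handling: an existing key is treated as "already bound",
    // so a stale entry from before failover survives.
    public boolean addIfAbsent(String clusterName, long remoteQueueId) {
        return bindings.putIfAbsent(clusterName, remoteQueueId) == null;
    }

    // Defensive handling (one possible approach): if the key exists but
    // points at a different queue id, replace the stale entry.
    public boolean addOrReplace(String clusterName, long remoteQueueId) {
        Long old = bindings.get(clusterName);
        if (old != null && old == remoteQueueId) {
            return false; // same binding announced twice, nothing to do
        }
        bindings.put(clusterName, remoteQueueId);
        return true;
    }

    // Queue id that a message for this clusterName would be routed to.
    public long routedQueueId(String clusterName) {
        return bindings.get(clusterName);
    }

    public static void main(String[] args) {
        StaleBindingSketch naive = new StaleBindingSketch();
        naive.addIfAbsent("sub1", 100L); // binding created while Live1 was up
        naive.addIfAbsent("sub1", 200L); // same clusterName recreated on backup1
        System.out.println("naive routes to " + naive.routedQueueId("sub1"));

        StaleBindingSketch defensive = new StaleBindingSketch();
        defensive.addOrReplace("sub1", 100L);
        defensive.addOrReplace("sub1", 200L);
        System.out.println("defensive routes to " + defensive.routedQueueId("sub1"));
    }
}
```

In the naive variant the second announcement is ignored and messages keep going to queue 100, which no longer exists after failover. In the defensive variant the entry is replaced, so routing follows the new temp queue 200.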
-- This message was sent by Atlassian Jira (v8.3.4#803005)