Hi,

We are setting up a cluster of 6 brokers using Artemis 2.6.2.

 

The cluster has 3 groups:

- Each group consists of one master and one slave broker.

- The HA uses replication.

- Each master broker configuration has the flag 'check-for-live-server' set to 
true.

- Each slave broker configuration has the flag 'allow-failback' set to true.

- We use static connectors for cluster topology discovery.

- Each broker's static connector list includes the connectors to the other 5 
servers in the cluster.

- Each broker declares its acceptor.

- Each broker exports its own connector information via the  'connector-ref' 
configuration element.

- The acceptor and connector URLs for each broker are identical with respect to 
the host and port information.
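The relevant HA and cluster sections of our broker.xml look roughly like this (host names, ports, and connector names below are placeholders, this is a sketch of our setup rather than the exact files):

```xml
<!-- master broker -->
<ha-policy>
  <replication>
    <master>
      <check-for-live-server>true</check-for-live-server>
    </master>
  </replication>
</ha-policy>

<!-- slave broker -->
<ha-policy>
  <replication>
    <slave>
      <allow-failback>true</allow-failback>
    </slave>
  </replication>
</ha-policy>

<!-- each broker declares its own connector plus the 5 others -->
<connectors>
  <connector name="self">tcp://broker1:61616</connector>
  <connector name="broker2">tcp://broker2:61616</connector>
  <!-- ... connectors for the remaining brokers ... -->
</connectors>

<cluster-connections>
  <cluster-connection name="my-cluster">
    <connector-ref>self</connector-ref>
    <static-connectors>
      <connector-ref>broker2</connector-ref>
      <!-- ... refs to the other brokers ... -->
    </static-connectors>
  </cluster-connection>
</cluster-connections>
```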

 

We have a standalone test application that creates producers and consumers to 
send and receive messages using a transacted JMS session.

 

We are trying to execute an automatic failover test case followed by failback 
as follows:

Test Case 1

Step 1: Master & standby alive

Step 2: Producer sends messages, say 9 messages

Step 3: Kill master

Step 4: Producer sends another 9 messages

Step 5: Kill standby

Step 6: Start master

Step 7: Start standby

What we see is that the standby syncs with the master, discarding its own 
internal state, and we are able to consume only the first 9 messages, leading 
to a loss of the 9 messages sent after failover.


Test Case 2

Step 1: Master & standby alive

Step 2: Producer sends messages

Step 3: Kill master

Step 4: Producer sends messages

Step 5: Kill standby

Step 6: Start standby (it waits for the master)

Step 7: Start master (question: does it wait for the slave?)

Step 8: Consume messages

 

Can someone provide any insight into this potential message loss?

Also, is there an alternative topology we could use here to get around this 
issue?

 

Thanks

Neha