Hi,
We are setting up a cluster of 6 brokers using Artemis 2.6.2. The cluster has 3 groups:
- Each group has one master and one slave broker pair.
- HA uses replication.
- Each master broker configuration has the flag 'check-for-live-server' set to true.
- Each slave broker configuration has the flag 'allow-failback' set to true.
- We use static connectors for cluster topology discovery.
- Each broker's static connector list includes the connectors to the other 5 servers in the cluster.
- Each broker declares its acceptor.
- Each broker exports its own connector information via the 'connector-ref' configuration element.
- The acceptor and connector URLs for each broker are identical with respect to host and port.

We have a standalone test application that creates producers and consumers to send and receive messages using a transacted JMS session.

We are trying to execute an automatic failover test case followed by failback, as follows:

Test Case 1
Step 1: Master and slave alive
Step 2: Producer sends 9 messages
Step 3: Kill master
Step 4: Producer sends another 9 messages
Step 5: Kill slave
Step 6: Start master
Step 7: Start slave

What we see is that the slave synchronizes with the master, discarding its own internal state, and we are able to consume only the first 9 messages, leading to a loss of the other 9 messages.

Test Case 2
Step 1: Master and slave alive
Step 2: Producer sends messages
Step 3: Kill master
Step 4: Producer sends messages
Step 5: Kill slave
Step 6: Start slave (it waits for the master)
Step 7: Start master (question: does it wait for the slave?)
Step 8: Consume messages

Can someone provide any insights regarding the potential message loss? Also, are there alternative topologies we could use to get around this issue?

Thanks,
Neha
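For reference, this is a minimal sketch of the HA sections of our broker.xml for one master/slave pair (inside the <core> element); the group name is hypothetical, and defaults are assumed everywhere else:

```xml
<!-- master broker.xml -->
<ha-policy>
  <replication>
    <master>
      <!-- on restart, check whether a live server (the failed-over slave)
           is already running for this group before becoming live -->
      <check-for-live-server>true</check-for-live-server>
      <!-- hypothetical group name pairing this master with its slave -->
      <group-name>group-1</group-name>
    </master>
  </replication>
</ha-policy>

<!-- slave broker.xml -->
<ha-policy>
  <replication>
    <slave>
      <!-- after the original master restarts, fail back to it automatically -->
      <allow-failback>true</allow-failback>
      <group-name>group-1</group-name>
    </slave>
  </replication>
</ha-policy>
```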