Hi Lewis,

The `quorum-vote-wait` parameter only affects nodes that are acting as backups. It defines how long a backup node waits for quorum vote responses, not how long it waits before sending a quorum vote request. So this parameter will not help Backup-1 participate in the quorum vote.
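For illustration only, here is a minimal sketch of where the option would take effect. It assumes Backup-3 (the backup of Active-3) is configured as a replication slave in the same `server3` group as in your master configuration. The value 12 is simply the one from your configuration, not a recommendation, and the unit is presumably seconds, in line with the "Waiting 5 seconds" log message.

```xml
<!-- Hypothetical ha-policy for Backup-3; the group name and value are taken from the quoted master config -->
<ha-policy>
  <replication>
    <slave>
      <!-- Pairs this backup with the live broker that uses the same group name -->
      <group-name>server3</group-name>
      <!-- Time this backup waits for quorum vote responses once it has requested a vote -->
      <quorum-vote-wait>12</quorum-vote-wait>
    </slave>
  </replication>
</ha-policy>
```

Set this way, the timeout applies to the vote the backup runs when it loses its live broker. It does not change how long Active-3 itself waits.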
In any case, I would not keep Active-3 live without a quorum, in order to avoid split-brain. ARTEMIS-2716 [1] should address your use case.

[1] https://issues.apache.org/jira/browse/ARTEMIS-2716

Regards,
Domenico

On Mon, 12 Jul 2021 at 06:10, Lewis Gardner <lewisgard...@gmail.com> wrote:

> I have a 3-active/backup pair HA setup with each pair on a separate network
> segment.
>
> Seg 1: Active-1 and Backup-3 (backup for Active-3)
> Seg 2: Active-2 and Backup-1 (backup for Active-1)
> Seg 3: Active-3 and Backup-2 (backup for Active-2)
>
> I am using the "vote-on-replication-failure = true" option to automatically
> shutdown active nodes which have been network isolated.
>
> If I disconnect network segment 1, Backup-1 on segment 2 properly announces
> itself as Live. Active-3 however attempts to get quorum votes from both
> Active-1 and Active-2, does not receive a reply from Active-1 (as that one
> is on the same failed network segment as Backup-3) and shuts itself down
> after 5 seconds with "Timeout waiting for quorum vote responses"
>
> I have tried increasing the timeout to allow Backup-1 to complete becoming
> Live and participating in Active-3's quorum request but Active-3 always
> prints "Waiting 5 seconds for quorum vote results", independently of what
> value I specify in the "quorum-vote-wait" option.
>
> The Active-3 configuration is shown below:
>
> <connectors>
>   <connector name="netty-active-1">tcp://192.168.2.20:61616?sslEnabled=true</connector>
>   <connector name="netty-active-2">tcp://192.168.2.21:61616?sslEnabled=true</connector>
>   <connector name="netty-active-3">tcp://192.168.2.22:61616?sslEnabled=true</connector>
>   <connector name="netty-backup-1">tcp://192.168.2.20:61716?sslEnabled=true</connector>
>   <connector name="netty-backup-2">tcp://192.168.2.21:61716?sslEnabled=true</connector>
>   <connector name="netty-backup-3">tcp://192.168.2.22:61716?sslEnabled=true</connector>
> </connectors>
>
> <cluster-connections>
>   <cluster-connection name="my-cluster">
>     <connector-ref>netty-active-3</connector-ref>
>     <check-period>1000</check-period>
>     <connection-ttl>5000</connection-ttl>
>     <call-timeout>5000</call-timeout>
>     <retry-interval>500</retry-interval>
>     <retry-interval-multiplier>1.0</retry-interval-multiplier>
>     <max-retry-interval>5000</max-retry-interval>
>     <initial-connect-attempts>-1</initial-connect-attempts>
>     <reconnect-attempts>-1</reconnect-attempts>
>     <use-duplicate-detection>true</use-duplicate-detection>
>     <message-load-balancing>ON_DEMAND</message-load-balancing>
>     <max-hops>1</max-hops>
>     <notification-interval>1000</notification-interval>
>     <notification-attempts>2</notification-attempts>
>     <static-connectors>
>       <connector-ref>netty-active-2</connector-ref>
>       <connector-ref>netty-active-3</connector-ref>
>       <connector-ref>netty-backup-1</connector-ref>
>       <connector-ref>netty-backup-2</connector-ref>
>       <connector-ref>netty-backup-3</connector-ref>
>     </static-connectors>
>   </cluster-connection>
> </cluster-connections>
>
> <ha-policy>
>   <replication>
>     <master>
>       <vote-on-replication-failure>true</vote-on-replication-failure>
>       <quorum-vote-wait>12</quorum-vote-wait>
>       <check-for-live-server>true</check-for-live-server>
>       <group-name>server3</group-name>
>     </master>
>   </replication>
> </ha-policy>
>
> How can I make Active-3 wait for Backup-1 to become live before shutting
> down?
>
> regards,
> Lewis