Hello
I have a master/backup replication setup with static connectors, two nodes,
and failback enabled. Both nodes are configured with a network-check-list
pointing to a third node to avoid split brain. I have also set up a reverse
proxy in front of the master/slave Artemis brokers, because a client of ours
cannot be configured with multiple server addresses. The reverse proxy
(HAProxy) is configured for TCP passthrough, with its upstreams pointed at
the master and slave acceptors.
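The HAProxy side looks roughly like this (a trimmed sketch; the frontend and backend names are illustrative, the hostnames and port match my Artemis config below):

```haproxy
frontend artemis_in
    bind *:61616
    mode tcp
    default_backend artemis_nodes

backend artemis_nodes
    mode tcp
    option tcp-check
    # node1 is the master; node2 is only used when node1's check fails
    server node1 node1:61616 check
    server node2 node2:61616 check backup
```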
When I shut down the master, the backup promotes itself to live. When I boot
the master back up, the slave keeps listening on its acceptor port, so my
reverse proxy continues forwarding traffic to it. In other words, either
failback is not working correctly, or Artemis keeps an acceptor port open
even after a failback has happened.
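For what it's worth, this is how I check which node's acceptor is open (a rough sketch, assuming shell access to a box that can reach both nodes; node1/node2 are the hostnames from my config below):

```shell
#!/usr/bin/env bash
# check_acceptor HOST PORT: exit 0 if a TCP connection to HOST:PORT succeeds
check_acceptor() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for node in node1 node2; do
  if check_acceptor "$node" 61616; then
    echo "$node: acceptor open"
  else
    echo "$node: acceptor closed"
  fi
done
```

After failback I would expect only node1 to report open, but both do.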
Could you elaborate a little on what should happen when failback occurs?
Should the slave stop its acceptors or not? Is there a way to force the
slave to restart itself when a failback occurs?
My config (the relevant parts):
node1 (master):
<network-check-list>node2,node3</network-check-list>
<network-check-ping-command>ping -c 1 -t %d %s</network-check-ping-command>
<ha-policy>
<replication>
<master>
<check-for-live-server>true</check-for-live-server>
</master>
</replication>
</ha-policy>
<cluster-connections>
<cluster-connection name="artemis-cluster">
<connector-ref>netty-connector</connector-ref>
<check-period>1000</check-period>
<connection-ttl>5000</connection-ttl>
<initial-connect-attempts>-1</initial-connect-attempts>
<reconnect-attempts>-1</reconnect-attempts>
<static-connectors>
<connector-ref>netty-backup-connector</connector-ref>
</static-connectors>
</cluster-connection>
</cluster-connections>
<cluster-user>cluster-user</cluster-user>
<cluster-password>password</cluster-password>
<connectors>
<connector name="netty-connector">tcp://node1:61616</connector>
<connector name="netty-backup-connector">tcp://node2:61616</connector>
</connectors>
<acceptors>
<acceptor name="netty-acceptor">tcp://0.0.0.0:61616</acceptor>
</acceptors>
node2 (slave):
<network-check-list>node1,node3</network-check-list>
<network-check-ping-command>ping -c 1 -t %d %s</network-check-ping-command>
<ha-policy>
<replication>
<slave>
<allow-failback>true</allow-failback>
<max-saved-replicated-journals-size>0</max-saved-replicated-journals-size>
</slave>
</replication>
</ha-policy>
<cluster-connections>
<cluster-connection name="artemis-cluster">
<connector-ref>netty-connector</connector-ref>
<check-period>1000</check-period>
<connection-ttl>5000</connection-ttl>
<initial-connect-attempts>-1</initial-connect-attempts>
<reconnect-attempts>-1</reconnect-attempts>
<static-connectors>
<connector-ref>netty-connector</connector-ref>
</static-connectors>
</cluster-connection>
</cluster-connections>
<cluster-user>cluster-user</cluster-user>
<cluster-password>password</cluster-password>
<connectors>
<connector name="netty-connector">tcp://node1:61616</connector>
<connector name="netty-backup-connector">tcp://node2:61616</connector>
</connectors>
<acceptors>
<acceptor name="netty-acceptor">tcp://0.0.0.0:61616</acceptor>
</acceptors>