[ 
https://issues.apache.org/jira/browse/ARTEMIS-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074493#comment-17074493
 ] 

Francesco Nigro commented on ARTEMIS-2690:
------------------------------------------

Are you using 4 pairs? If so, the quorum vote cannot work at its best if you 
have network partition-like issues, because an even number of nodes in the 
cluster can split into two halves, neither of which has a majority...
Anyway, the method used to detect whether other nodes use the same ID, at 
https://github.com/apache/activemq-artemis/blob/e82d95fff640e697f5104a325e66e40bd2b1c69b/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/SharedNothingLiveActivation.java#L322
, should make use of the quorum vote in case no other node is detected, but I 
am not sure whether it would improve your case.
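
To illustrate the even-node point in general terms, here is a minimal sketch 
of plain strict-majority arithmetic. It is not the Artemis quorum vote 
implementation; the class and method names (QuorumSketch, majority, 
hasMajority) are hypothetical and exist only for this example.

```java
final class QuorumSketch {

    // Smallest vote count that is strictly more than half of clusterSize.
    static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    // True if a partition that can reach `reachable` nodes (itself included)
    // holds a strict majority of the whole cluster.
    static boolean hasMajority(int reachable, int clusterSize) {
        return reachable >= majority(clusterSize);
    }

    public static void main(String[] args) {
        // Assumption for illustration: 4 voting nodes, as in the question above.
        int clusterSize = 4;
        for (int reachable = 0; reachable <= clusterSize; reachable++) {
            System.out.println(reachable + " of " + clusterSize
                    + " reachable, majority: " + hasMajority(reachable, clusterSize));
        }
    }
}
```

With 4 voters a 2/2 split leaves both sides without a majority, whereas with 
3 voters any two-way split leaves exactly one side with 2 of 3.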

> Intermittent network failure caused live and replica to both be live
> --------------------------------------------------------------------
>
>                 Key: ARTEMIS-2690
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2690
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.11.0
>         Environment: Artemis 2.11.0, Ubuntu 18.04
>            Reporter: Sebastian Lövdahl
>            Priority: Major
>         Attachments: live1-artemis.log, live1-broker.xml, live2-artemis.log, 
> live2-broker.xml, live3-artemis.log, live3-broker.xml, replica1-artemis.log, 
> replica1-broker.xml
>
>
> An intermittent network failure caused both the live and replica to be live. 
> Both happily accepted incoming connections until the node that was supposed 
> to be the replica was manually shut down. Log files from all 4 nodes are 
> attached. The {{replica1}} node happened to have some TRACE logging enabled 
> as well.
>  
> As far as I have understood the documentation, the setup should be safe from 
> a split brain point of view. The live2 and live3 nodes intentionally don't 
> have any replicas at the moment. Complete {{broker.xml}} files are attached, 
> but for reference, this is the {{ha-policy}}:
> live1:
> {code:xml}
> <ha-policy>
>   <replication>
>     <master>
>       <cluster-name>my-cluster</cluster-name>
>       <group-name>group1</group-name>
>       <check-for-live-server>true</check-for-live-server>
>       <vote-on-replication-failure>true</vote-on-replication-failure>
>     </master>
>   </replication>
> </ha-policy>
> {code}
> replica1:
> {code:xml}
> <ha-policy>
>   <replication>
>     <slave>
>        <cluster-name>my-cluster</cluster-name>
>        <group-name>group1</group-name>
>        <allow-failback>true</allow-failback>
>        <vote-on-replication-failure>true</vote-on-replication-failure>
>     </slave>
>   </replication>
> </ha-policy>
> {code}
> live2:
> {code:xml}
> <ha-policy>
>   <replication>
>     <master>
>        <cluster-name>my-cluster</cluster-name>
>        <group-name>group2</group-name>
>        <check-for-live-server>true</check-for-live-server>
>        <vote-on-replication-failure>true</vote-on-replication-failure>
>     </master>
>   </replication>
> </ha-policy>
> {code}
> live3:
> {code:xml}
> <ha-policy>
>   <replication>
>     <master>
>        <cluster-name>my-cluster</cluster-name>
>        <group-name>group2</group-name>
>        <check-for-live-server>true</check-for-live-server>
>        <vote-on-replication-failure>true</vote-on-replication-failure>
>     </master>
>   </replication>
> </ha-policy>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
