Hello all,

I am trying to understand the results of the following two Corosync heartbeat ring failure scenarios I have been testing, and I hope somebody can explain why they make sense.
Consider the following cluster:

* 3 nodes: A, B and C
* 2 NICs per node
* Corosync 2.3.5 configured with "rrp_mode: passive" and the udpu transport, with ring ids 0 and 1 on each node (a configuration sketch is in the P.S. below)
* On each node "corosync-cfgtool -s" shows:
    [...]
    ring 0 active with no faults
    [...]
    ring 1 active with no faults

Consider the following scenarios:

1. On node A only, block all communication on the first NIC, which carries ring 0 (example commands are in the P.S. below)
2. On node A only, block all communication on both NICs, i.e. on rings 0 and 1

The results of these scenarios are as follows:

1. Nodes A, B and C (!) display the following ring status:
    [...]
    Marking ringid 0 interface <IP-Address> FAULTY
    [...]
    ring 1 active with no faults
2. Node A is shown as OFFLINE - B and C display the following ring status:
    [...]
    ring 0 active with no faults
    [...]
    ring 1 active with no faults

Questions:

1. Is this the expected outcome?
2. In scenario 1, B and C can still communicate with each other over both NICs, so why do B and C not report a "no faults" status for rings 0 and 1, just as they do in scenario 2 when node A is completely unreachable?

Regards,
Martin Schlegel
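
P.S. For reference, here is a minimal corosync.conf sketch of the setup described above. The subnets, addresses and node ids are placeholders rather than my actual values:

    totem {
        version: 2
        rrp_mode: passive
        transport: udpu

        # ring 0 on the first NIC's subnet
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.0.0
            mcastport: 5405
        }

        # ring 1 on the second NIC's subnet
        interface {
            ringnumber: 1
            bindnetaddr: 192.168.1.0
            mcastport: 5405
        }
    }

    nodelist {
        node {
            nodeid: 1
            ring0_addr: 192.168.0.1
            ring1_addr: 192.168.1.1
        }
        node {
            nodeid: 2
            ring0_addr: 192.168.0.2
            ring1_addr: 192.168.1.2
        }
        node {
            nodeid: 3
            ring0_addr: 192.168.0.3
            ring1_addr: 192.168.1.3
        }
    }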
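For concreteness, one way to block all communication on a NIC, as in scenario 1, is with iptables rules along these lines (eth0 is a placeholder for the interface carrying ring 0; scenario 2 repeats the same pair of rules for the second NIC):

    # on node A: drop all traffic in and out of the ring-0 NIC
    iptables -A INPUT  -i eth0 -j DROP
    iptables -A OUTPUT -o eth0 -j DROP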