Hello all,

I am trying to understand the results of the following two Corosync heartbeat ring failure scenarios I have been testing, and I hope somebody can explain why they make sense.
Consider the following cluster:

* 3 nodes: A, B and C
* 2 NICs per node
* Corosync 2.3.5 configured with "rrp_mode: passive" and the udpu transport, with ring ids 0 and 1 on each node (a configuration sketch is in the P.S. below)
* On each node "corosync-cfgtool -s" shows:
    [...]
    ring 0 active with no faults
    [...]
    ring 1 active with no faults

Consider the following scenarios:

1. On node A only, block all communication on the first NIC, which carries ring 0 (example commands are in the P.S. below)
2. On node A only, block all communication on both NICs, i.e. on rings 0 and 1

The results of these scenarios are as follows:

1. Nodes A, B and C (!) display the following ring status:
    [...]
    Marking ringid 0 interface <IP-Address> FAULTY
    [...]
    ring 1 active with no faults
2. Node A is shown as OFFLINE - B and C display the following ring status:
    [...]
    ring 0 active with no faults
    [...]
    ring 1 active with no faults

Questions:

1. Is this the expected outcome?
2. In scenario 1, B and C can still communicate with each other over both NICs, so why do B and C not report a "no faults" status for rings 0 and 1, just as they do in scenario 2 when node A is completely unreachable?

Regards,
Martin Schlegel
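
P.S. For reference, here is a minimal corosync.conf sketch of the setup described above. The subnets, addresses and node ids are placeholders rather than my actual values:

    totem {
        version: 2
        rrp_mode: passive
        transport: udpu

        # ring 0 on the first NIC's subnet
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.0.0
            mcastport: 5405
        }

        # ring 1 on the second NIC's subnet
        interface {
            ringnumber: 1
            bindnetaddr: 192.168.1.0
            mcastport: 5405
        }
    }

    nodelist {
        node {
            nodeid: 1
            ring0_addr: 192.168.0.1
            ring1_addr: 192.168.1.1
        }
        node {
            nodeid: 2
            ring0_addr: 192.168.0.2
            ring1_addr: 192.168.1.2
        }
        node {
            nodeid: 3
            ring0_addr: 192.168.0.3
            ring1_addr: 192.168.1.3
        }
    }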
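For concreteness, one way to block all communication on a NIC, as in scenario 1, is with iptables rules along these lines (eth0 is a placeholder for the interface carrying ring 0; scenario 2 repeats the same pair of rules for the second NIC):

    # on node A: drop all traffic in and out of the ring-0 NIC
    iptables -A INPUT  -i eth0 -j DROP
    iptables -A OUTPUT -o eth0 -j DROP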