The pithy ruminations from "Fabio M. Di Nitto" <fdini...@redhat.com> on "Re: 
[Linux-cluster] quorum device not getting a vote causes 2-node cluster to be 
inquorate" were:

=> On 03/15/2011 05:11 AM, berg...@merctech.com wrote:
=> > I have been using a 2-node cluster with a quorum disk successfully for
=> > about 2 years. Beginning today, the cluster will not boot correctly.
=> > 
=> > The RHCS services start, but fencing fails with:
=> >    
=> >    dlm: no local IP address has been set
=> >    dlm: cannot start dlm lowcomms -107
=> > 
=> > This seems to be a symptom of the fact that the cluster votes do not
=> > include votes from the quorum device:
=> > 
=> >    # clustat
=> >    Cluster Status for example-infra @ Tue Mar 15 00:02:35 2011
=> >    Member Status: Inquorate
=> > 
=> >    Member Name                                              ID   Status
=> >    ------ ----                                              ---- ------
=> >    example-infr2-admin.domain.com                              1 Online, Local
=> >    example-infr1-admin.domain.com                              2 Offline
=> >    /dev/mpath/quorum                                           0 Offline
=> > 
=> >    [root@example-infr2 ~]# cman_tool status
=> >    Version: 6.2.0
=> >    Config Version: 239
=> >    Cluster Name: example-infra
=> >    Cluster Id: 42813
=> >    Cluster Member: Yes
=> >    Cluster Generation: 676844
=> >    Membership state: Cluster-Member
=> >    Nodes: 1
=> >    Expected votes: 2
=> >    Total votes: 1
=> >    Quorum: 2 Activity blocked
=> >    Active subsystems: 7
=> >    Flags: 
=> >    Ports Bound: 0  
=> >    Node name: example-infr2-admin.domain.com
=> >    Node ID: 1
=> >    Multicast addresses: 239.192.167.228 
=> >    Node addresses: 192.168.110.3 
=> 
=> You should check the output from cman_tool nodes. It appears that the
=> nodes are not seeing each other at all.

That's correct: at the time I ran cman_tool and clustat, one node was down 
(deliberately, in an attempt to troubleshoot the issue, but this would also be 
the case in the event of a hardware failure).

As I see it, the problem is not with the inter-node communication, but with the 
quorum device. Note that there is only one vote registered--there are no votes 
from the quorum device. The quorum device should provide sufficient votes to 
make the "cluster" quorate if only one node is running.

If I understand it correctly, this should also let the "cluster" start with a 
single node (as long as that node can write to the quorum device). If my 
understanding is wrong, then how can a 2-node cluster start if one node is down?
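For reference, my understanding of the vote math comes from how the quorum 
device is declared in cluster.conf. The sketch below shows the relevant pieces 
as I believe they should look for this kind of setup; the quorumd label, 
interval, and tko values are placeholders, not our actual config:

```xml
<?xml version="1.0"?>
<!-- Sketch only: label/interval/tko are placeholder values. -->
<cluster name="example-infra" config_version="239">
  <!-- 2 node votes + 1 qdisk vote = 3 expected; quorum is reached at 2,
       so one node plus the qdisk should keep the cluster quorate. -->
  <cman expected_votes="3"/>
  <clusternodes>
    <clusternode name="example-infr2-admin.domain.com" nodeid="1" votes="1"/>
    <clusternode name="example-infr1-admin.domain.com" nodeid="2" votes="1"/>
  </clusternodes>
  <!-- qdiskd must be running and the label must match what mkqdisk wrote
       to the device for its vote to be registered with cman. -->
  <quorumd interval="1" tko="10" votes="1" label="quorum_disk"/>
</cluster>
```

If that's right, then "Expected votes: 2" in the cman_tool output above suggests 
cman never registered the qdisk vote at all.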

=> 
=> The first things I would check are iptables, node names resolves to the
=> correct ip addresses, selinux and eventually if the switch in between
=> the nodes support multicast.

SElinux is disabled (as it has been for the 2 years this cluster has been 
operational).

There have been no switch changes.

Node names & IPs resolve correctly.

IPtables permits all communication between the "admin" address on the servers.

=> 
=> Fabio
=> 
=> --
=> Linux-cluster mailing list
=> Linux-cluster@redhat.com
=> https://www.redhat.com/mailman/listinfo/linux-cluster
=> 

Thanks,

Mark
