On 3/13/12 2:49 PM, emmanuel segura wrote:
> Sorry William,
>
> but I think clvmd must be used with ocf:lvm2:clvmd. Example:
>
>   crm configure
>   primitive clvmd ocf:lvm2:clvmd params daemon_timeout="30"
>   clone cln_clvmd clvmd
>
> And remember, clvmd depends on dlm, so you should do the same for dlm.
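[Editor's note: Emmanuel's suggestion, extended to the dlm dependency he mentions, might look like the sketch below. As William notes in his reply, ocf:lvm2:clvmd ships with SUSE's lvm2 packages and may not exist on RHEL; the ocf:pacemaker:controld agent for dlm and all resource/constraint names here are illustrative assumptions, not a tested configuration.]

```
# crm configure -- illustrative sketch, not a verified configuration
primitive dlm ocf:pacemaker:controld op monitor interval="60" timeout="60"
primitive clvmd ocf:lvm2:clvmd params daemon_timeout="30"
clone cln_dlm dlm
clone cln_clvmd clvmd
# clvmd depends on dlm: start dlm first, and keep the clones together
order o_dlm_before_clvmd inf: cln_dlm cln_clvmd
colocation col_clvmd_with_dlm inf: cln_clvmd cln_dlm
```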
I don't have an ocf:lvm2:clvmd resource on my system. When I do a web search, it
looks like a resource found on SUSE systems, but not on RHEL distros.

Based on "Clusters From Scratch", I think that if I'm using cman, dlm is started
automatically. I see dlm_controld is running without my explicitly starting it:

# ps aux | grep dlm_controld
root      2495  0.0  0.0 234688  7564 ?  Ssl  12:32  0:00 dlm_controld

I should have also mentioned that I can duplicate this problem outside
pacemaker. That is, I can start cman, clvmd, and gfs2 manually on both nodes,
cut off power on one node, and clustering fails on the other node. So I suspect
it's not a pacemaker resource problem.

For a moment I thought I might not have used "-p lock_dlm" when I created my
GFS2 filesystems, but I think the output of "gfs2_edit -p sb ..." shows that I
did it correctly: <http://pastebin.com/ALQYpKAy>.

When I looked more carefully at my lvm.conf, I saw that I had a typo:

  fallback_to_local_locking=4

I changed it to the correct value (according to
<https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial>):

  fallback_to_local_locking=0

Unfortunately this doesn't solve the problem. So... any ideas?

> On 13 March 2012 at 17:29, William Seligman <selig...@nevis.columbia.edu>
> wrote:
>
>> I'm not sure if this is a "Linux-HA" question; please direct me to the
>> appropriate list if it's not.
>>
>> I'm setting up a two-node cman+pacemaker+gfs2 cluster as described in
>> "Clusters From Scratch." Fencing is through forcibly rebooting a node by
>> cutting and restoring its power via UPS.
>>
>> My fencing/failover tests have revealed a problem. If I gracefully turn
>> off one node ("crm node standby"; "service pacemaker stop"; "shutdown -r
>> now"), all the resources transfer to the other node with no problems. If
>> I cut power to one node (as would happen if it were fenced), the
>> lsb::clvmd resource on the remaining node eventually fails.
>> Since all the other resources depend on clvmd, all the resources on the
>> remaining node stop and the cluster is left with nothing running.
>>
>> I've traced why the lsb::clvmd resource fails: the monitor/status command
>> includes "vgdisplay", which hangs indefinitely. Therefore the monitor
>> will always time out.
>>
>> So this isn't a problem with pacemaker, but with clvmd/dlm: if a node is
>> cut off, the cluster isn't handling it properly. Has anyone on this list
>> seen this before? Any ideas?
>>
>> Details:
>>
>> Versions:
>> Red Hat Enterprise Linux 6.2 (kernel 2.6.32)
>> cman-3.0.12.1
>> corosync-1.4.1
>> pacemaker-1.1.6
>> lvm2-2.02.87
>> lvm2-cluster-2.02.87
>>
>> cluster.conf: <http://pastebin.com/w5XNYyAX>
>> Output of "crm configure show": <http://pastebin.com/atVkXjkn>
>> Output of "lvm dumpconfig": <http://pastebin.com/rtw8c3Pf>
>>
>> /var/log/cluster/dlm_controld.log and /var/log/cluster/gfs_controld.log
>> show nothing. When I shut down power to one node (orestes-tb), the output
>> of grep -E "(dlm|gfs2|clvmd)" /var/log/messages is
>> <http://pastebin.com/vjpvCFeN>.

-- 
Bill Seligman             | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137                |
Irvington NY 10533  USA   | http://www.nevis.columbia.edu/~seligman/
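[Editor's note: the hang described above (vgdisplay blocking forever, so the monitor always times out) can at least be made to fail fast instead of tying up the monitor until pacemaker's timeout fires. This is a generic sketch of wrapping a potentially blocking command with coreutils `timeout`; it is not a fix for the underlying dlm problem, and `sleep 60` stands in for the hanging `vgdisplay` call.]

```shell
#!/bin/sh
# Sketch: turn an indefinite hang into a prompt, detectable failure.
# Here `sleep 60` stands in for a `vgdisplay` call blocked on dlm.
timeout 2 sleep 60
rc=$?
# coreutils `timeout` exits with status 124 when it kills the command.
if [ "$rc" -eq 124 ]; then
    echo "command hung and was killed after 2s (exit $rc)"
fi
```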
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems