Hi,

On Wed, Jul 28, 2010 at 06:48:49AM -0400, Rick Day wrote:
> I am setting up a two node cluster on RHEL 5.5 with Pacemaker
> 1.0.9.1-1. I have a resource set up to start NFS with lsb. I
> bring up my first node and everything is fine. All the
> resources start up. When I bring up the second node, it appears
> that the NFS resource tries to failover and then it just stops.
> Why would it even try to failover just because I bring the
> second node up? I have another two node cluster set up on
> Centos with a slightly different version of pacemaker and it
> works fine. Please see configuration and a couple of things
> out of the log file below. Please help.
>
> node SPDLFILE01 \
>         attributes standby="off"
> node SPDLFILE02 \
>         attributes standby="off"
> primitive drbd_nfs ocf:heartbeat:drbd \
>         params drbd_resource="r0" ignore_deprecation="true" \
>         op monitor interval="15s" \
>         op start interval="0" timeout="240" \
>         op stop interval="0" timeout="100"
> primitive fs_nfs ocf:heartbeat:Filesystem \
>         params device="/dev/drbd1" directory="/var/nfs" fstype="ext3" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60"
> primitive ip_nfs ocf:heartbeat:IPaddr2 \
>         params ip="192.168.104.60" cidr_netmask="32" \
>         op monitor interval="30s"
> primitive nfs lsb:nfs \
>         meta target-role="Started"
> group nfs_group fs_nfs ip_nfs
> ms ms_drbd_nfs drbd_nfs \
>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> location cli-standby-nfs nfs \
>         rule $id="cli-standby-rule-nfs" -inf: #uname eq SPDLFILE02

Why do you want to prevent nfs from running on this node? It won't
help on failover.

> colocation nfs_on_drbd inf: fs_nfs ms_drbd_nfs:Master
> order nfs_after_drbd inf: ms_drbd_nfs:promote fs_nfs:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1280277692"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
> This is what I see in crm_mon when the error occurs.......
>
>  Resource Group: nfs_group
>      fs_nfs (ocf::heartbeat:Filesystem): Started SPDLFILE01
>      ip_nfs (ocf::heartbeat:IPaddr2): Started SPDLFILE01
>
> Failed actions:
>     nfs_monitor_0 (node=SPDLFILE01, call=14, rc=5, status=complete): not installed
>
> Here are some warnings from the log file........
>
> lrmd: [8129]: WARN: For LSB init script, no additional parameters are needed.
> Jul 27 21:13:26 SPDLFILE01 crmd: [2850]: WARN: status_from_rc: Action 8
> (nfs_monitor_0) on SPDLFILE02 failed (target: 7 vs. rc: 0): Error

Was nfs started on boot? The probe on SPDLFILE02 expected the
resource to be stopped (7) but found it already running (0).

> Jul 27 21:13:27 SPDLFILE01 pengine: [2849]: WARN: See
> http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.
> Jul 27 21:13:27 SPDLFILE01 pengine: [2849]: WARN: native_create_actions:
> Attempting recovery of resource nfs
> Jul 27 21:13:27 SPDLFILE01 lrmd: [8366]: WARN: For LSB init script, no
> additional parameters are needed.
> Jul 27 21:13:27 SPDLFILE01 lrmd: [8399]: WARN: For LSB init script, no
> additional parameters are needed.
> Jul 27 21:13:27 SPDLFILE01 crmd: [2850]: WARN: status_from_rc: Action 42
> (nfs_start_0) on SPDLFILE01 failed (target: 0 vs. rc: 1): Error

nfs failed to start on node 1 (SPDLFILE01). The system logs should
have a clue.

BTW, you should probably use the ocf nfsserver RA
(ocf:heartbeat:nfsserver) instead of lsb:nfs.

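In case it helps, here is a rough sketch of what the nfsserver
variant could look like in the crm shell. The parameter values are
assumptions, not taken from your setup: the init script path, the
sm-notify path and the nfsinfo directory are placeholders to check
against your system, and the parameter list itself should be verified
with "crm ra meta ocf:heartbeat:nfsserver" on your version. Putting
nfs at the end of nfs_group is also my assumption, so that it only
starts after the filesystem and the IP are up on the same node.

```
# Sketch only: all values below are assumed, adjust them to your system.
primitive nfs ocf:heartbeat:nfsserver \
        params nfs_init_script="/etc/init.d/nfs" \
               nfs_notify_cmd="/sbin/sm-notify" \
               nfs_shared_infodir="/var/nfs/nfsinfo" \
               nfs_ip="192.168.104.60" \
        op monitor interval="30s"
group nfs_group fs_nfs ip_nfs nfs
```

Whichever agent you use, the cluster should be the only thing
starting nfs, so it should not also be enabled at boot (on RHEL that
would typically be "chkconfig nfs off" on both nodes).
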
Thanks,

Dejan

> Jul 27 21:13:27 SPDLFILE01 crmd: [2850]: WARN: update_failcount: Updating
> failcount for nfs on SPDLFILE01 after failed start: rc=1 (update=INFINITY,
> time=1280279607)
> Jul 27 21:13:28 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
> Jul 27 21:13:28 SPDLFILE01 pengine: [2849]: WARN: common_apply_stickiness:
> Forcing nfs away from SPDLFILE01 after 1000000 failures (max=1000000)
> Jul 27 21:13:28 SPDLFILE01 lrmd: [8460]: WARN: For LSB init script, no
> additional parameters are needed.
> Jul 27 21:13:28 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
> Jul 27 21:13:28 SPDLFILE01 pengine: [2849]: WARN: common_apply_stickiness:
> Forcing nfs away from SPDLFILE01 after 1000000 failures (max=1000000)
> Jul 27 21:28:28 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
> Jul 27 21:28:28 SPDLFILE01 pengine: [2849]: WARN: common_apply_stickiness:
> Forcing nfs away from SPDLFILE01 after 1000000 failures (max=1000000)
> Jul 27 21:32:42 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
> Jul 27 21:32:42 SPDLFILE01 pengine: [2849]: WARN: common_apply_stickiness:
> Forcing nfs away from SPDLFILE01 after 1000000 failures (max=1000000)
> Jul 27 21:32:42 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
> Jul 27 21:32:42 SPDLFILE01 pengine: [2849]: WARN: common_apply_stickiness:
> Forcing nfs away from SPDLFILE01 after 1000000 failures (max=1000000)
> Jul 27 21:41:03 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
> Jul 27 21:41:03 SPDLFILE01 pengine: [2849]: WARN: common_apply_stickiness:
> Forcing nfs away from SPDLFILE01 after 1000000 failures (max=1000000)
> Jul 27 21:41:03 SPDLFILE01 pengine: [2849]: WARN: unpack_rsc_op: Processing
> failed op nfs_start_0 on SPDLFILE01: unknown error (1)
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker