Hello, I am using Pacemaker 1.0.8 on RHEL 5 and, even after reading the docs, I have some problems understanding how the ping clone is supposed to work for setting up gateway monitoring...
As soon as I run:

crm configure location nfs-group-with-pinggw nfs-group rule -inf: not_defined pinggw or pinggw lte 0

the resources of the group stop and do not restart. Then, as soon as I run:

crm configure delete nfs-group-with-pinggw

the resources of the group start again.

The (relevant part of the) configuration I am trying to apply is:

group nfs-group ClusterIP lv_drbd0 NfsFS nfssrv \
        meta target-role="Started"
ms NfsData nfsdrbd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive pinggw ocf:pacemaker:ping \
        params host_list="192.168.101.1" multiplier="100" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100"
clone cl-pinggw pinggw \
        meta globally-unique="false"
location nfs-group-with-pinggw nfs-group \
        rule $id="nfs-group-with-pinggw-rule" -inf: not_defined pinggw or pinggw lte 0

Should the location constraint reference the ping resource or its clone?

Could part of the problem be that I have also defined an NFS client on the other node, with:

primitive nfsclient ocf:heartbeat:Filesystem \
        params device="nfsha:/nfsdata/web" directory="/nfsdata/web" fstype="nfs" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60"
colocation nfsclient_not_on_nfs-group -inf: nfs-group nfsclient
order nfsclient_after_nfs-group inf: nfs-group nfsclient

Thanks in advance,
Gianluca

From the messages of the server running nfs-group at that moment:

May 10 15:18:27 ha1 cibadmin: [29478]: info: Invoked: cibadmin -Ql
May 10 15:18:27 ha1 cibadmin: [29479]: info: Invoked: cibadmin -Ql
May 10 15:18:28 ha1 crm_shadow: [29536]: info: Invoked: crm_shadow -c __crmshell.29455
May 10 15:18:28 ha1 cibadmin: [29537]: info: Invoked: cibadmin -p -U
May 10 15:18:28 ha1 crm_shadow: [29539]: info: Invoked: crm_shadow -C __crmshell.29455 --force
May 10 15:18:28 ha1 cib: [8470]: info: cib_replace_notify: Replaced: 0.267.14 -> 0.269.1 from <null>
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: - <cib epoch="267" num_updates="14" admin_epoch="0" />
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <cib epoch="269" num_updates="1" admin_epoch="0" >
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <configuration >
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <constraints >
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <rsc_location id="nfs-group-with-pinggw" rsc="nfs-group" __crm_diff_marker__="added:top" >
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <rule boolean-op="or" id="nfs-group-with-pinggw-rule" score="-INFINITY" >
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <expression attribute="pinggw" id="nfs-group-with-pinggw-expression" operation="not_defined" />
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + <expression attribute="pinggw" id="nfs-group-with-pinggw-expression-0" operation="lte" value="0" />
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + </rule>
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + </rsc_location>
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + </constraints>
May 10 15:18:28 ha1 crmd: [8474]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change
May 10 15:18:28 ha1 attrd: [8472]: info: do_cib_replaced: Sending full refresh
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + </configuration>
May 10 15:18:28 ha1 crmd: [8474]: info: need_abort: Aborting on change to epoch
May 10 15:18:28 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: master-nfsdrbd:0 (10000)
May 10 15:18:28 ha1 cib: [8470]: info: log_data_element: cib:diff: + </cib>
May 10 15:18:28 ha1 crmd: [8474]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_replace for section 'all' (origin=local/crm_shadow/2, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 crmd: [8474]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/203, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 crmd: [8474]: info: do_pe_invoke: Query 205: Requesting the current CIB: S_POLICY_ENGINE
May 10 15:18:28 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
May 10 15:18:28 ha1 cib: [29541]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-47.raw
May 10 15:18:28 ha1 crmd: [8474]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=do_cib_replaced ]
May 10 15:18:28 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: terminate (<null>)
May 10 15:18:28 ha1 cib: [29541]: info: write_cib_contents: Wrote version 0.269.0 of the CIB to disk (digest: 8f92c20ff8f96cde0fa0c75cd3207caa)
May 10 15:18:28 ha1 crmd: [8474]: info: update_dc: Unset DC ha1
May 10 15:18:28 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: master-nfsdrbd:1 (<null>)
May 10 15:18:28 ha1 cib: [29541]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.FPnpLz (digest: /var/lib/heartbeat/crm/cib.EsRWbp)
May 10 15:18:28 ha1 crmd: [8474]: info: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
May 10 15:18:28 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown (<null>)
May 10 15:18:28 ha1 crmd: [8474]: info: do_dc_takeover: Taking over DC status for this partition
May 10 15:18:28 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd (100)
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_readwrite: We are now in R/O mode
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_slave_all for section 'all' (origin=local/crmd/206, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_readwrite: We are now in R/W mode
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_master for section 'all' (origin=local/crmd/207, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/208, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/210, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/212, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 crmd: [8474]: info: do_dc_join_offer_all: join-6: Waiting on 2 outstanding join acks
May 10 15:18:28 ha1 crmd: [8474]: info: ais_dispatch: Membership 180: quorum retained
May 10 15:18:28 ha1 crmd: [8474]: info: crm_ais_dispatch: Setting expected votes to 2
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/215, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 crmd: [8474]: info: config_query_callback: Checking for expired actions every 900000ms
May 10 15:18:28 ha1 crmd: [8474]: info: config_query_callback: Sending expected-votes=2 to corosync
May 10 15:18:28 ha1 crmd: [8474]: info: update_dc: Set DC to ha1 (3.0.1)
May 10 15:18:28 ha1 crmd: [8474]: info: ais_dispatch: Membership 180: quorum retained
May 10 15:18:28 ha1 crm_shadow: [29542]: info: Invoked: crm_shadow -D __crmshell.29455 --force
May 10 15:18:28 ha1 crmd: [8474]: info: crm_ais_dispatch: Setting expected votes to 2
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/218, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 crmd: [8474]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
May 10 15:18:28 ha1 crmd: [8474]: info: do_state_transition: All 2 cluster nodes responded to the join offer.
May 10 15:18:28 ha1 crmd: [8474]: info: do_dc_join_finalize: join-6: Syncing the CIB from ha1 to the rest of the cluster
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/219, version=0.269.1): ok (rc=0)
May 10 15:18:28 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/220, version=0.269.1): ok (rc=0)
May 10 15:18:29 ha1 crmd: [8474]: info: do_dc_join_ack: join-6: Updating node state to member for ha2
May 10 15:18:29 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/221, version=0.269.1): ok (rc=0)
May 10 15:18:29 ha1 crmd: [8474]: info: do_dc_join_ack: join-6: Updating node state to member for ha1
May 10 15:18:29 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_delete for section //node_sta...@uname='ha2']/lrm (origin=local/crmd/222, version=0.269.2): ok (rc=0)
May 10 15:18:29 ha1 crmd: [8474]: info: erase_xpath_callback: Deletion of "//node_sta...@uname='ha2']/lrm": ok (rc=0)
May 10 15:18:29 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_delete for section //node_sta...@uname='ha1']/lrm (origin=local/crmd/224, version=0.269.4): ok (rc=0)
May 10 15:18:29 ha1 crmd: [8474]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
May 10 15:18:29 ha1 crmd: [8474]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
May 10 15:18:29 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/226, version=0.269.5): ok (rc=0)
May 10 15:18:29 ha1 crmd: [8474]: info: do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date
May 10 15:18:29 ha1 crmd: [8474]: info: crm_update_quorum: Updating quorum status to true (call=228)
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_local_callback: Sending full refresh (origin=crmd)
May 10 15:18:29 ha1 cib: [8470]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/228, version=0.269.5): ok (rc=0)
May 10 15:18:29 ha1 crmd: [8474]: info: abort_transition_graph: do_te_invoke:191 - Triggered transition abort (complete=1) : Peer Cancelled
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: master-nfsdrbd:0 (10000)
May 10 15:18:29 ha1 crmd: [8474]: info: do_pe_invoke: Query 229: Requesting the current CIB: S_POLICY_ENGINE
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
May 10 15:18:29 ha1 crmd: [8474]: info: erase_xpath_callback: Deletion of "//node_sta...@uname='ha1']/lrm": ok (rc=0)
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: terminate (<null>)
May 10 15:18:29 ha1 crmd: [8474]: info: te_update_diff: Detected LRM refresh - 8 resources updated: Skipping all resource events
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: master-nfsdrbd:1 (<null>)
May 10 15:18:29 ha1 crmd: [8474]: info: abort_transition_graph: te_update_diff:227 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.269.5) : LRM Refresh
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown (<null>)
May 10 15:18:29 ha1 crmd: [8474]: info: do_pe_invoke_callback: Invoking the PE: query=229, ref=pe_calc-dc-1273497509-143, seq=180, quorate=1
May 10 15:18:29 ha1 pengine: [8473]: notice: unpack_config: On loss of CCM Quorum: Ignore
May 10 15:18:29 ha1 attrd: [8472]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd (100)
May 10 15:18:29 ha1 crmd: [8474]: info: do_pe_invoke: Query 230: Requesting the current CIB: S_POLICY_ENGINE
May 10 15:18:29 ha1 pengine: [8473]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
May 10 15:18:29 ha1 crmd: [8474]: info: do_pe_invoke_callback: Invoking the PE: query=230, ref=pe_calc-dc-1273497509-144, seq=180, quorate=1
May 10 15:18:29 ha1 pengine: [8473]: info: determine_online_status: Node ha1 is online
May 10 15:18:29 ha1 pengine: [8473]: notice: unpack_rsc_op: Operation nfsdrbd:0_monitor_0 found resource nfsdrbd:0 active in master mode on ha1
May 10 15:18:29 ha1 pengine: [8473]: info: determine_online_status: Node ha2 is online
May 10 15:18:29 ha1 pengine: [8473]: notice: native_print: SitoWeb (ocf::heartbeat:apache): Started ha1
May 10 15:18:29 ha1 pengine: [8473]: notice: clone_print: Master/Slave Set: NfsData
May 10 15:18:29 ha1 pengine: [8473]: notice: short_print: Masters: [ ha1 ]
May 10 15:18:29 ha1 pengine: [8473]: notice: short_print: Slaves: [ ha2 ]
May 10 15:18:29 ha1 pengine: [8473]: notice: group_print: Resource Group: nfs-group
May 10 15:18:29 ha1 pengine: [8473]: notice: native_print: ClusterIP (ocf::heartbeat:IPaddr2): Started ha1
May 10 15:18:29 ha1 pengine: [8473]: notice: native_print: lv_drbd0 (ocf::heartbeat:LVM): Started ha1
May 10 15:18:29 ha1 pengine: [8473]: notice: native_print: NfsFS (ocf::heartbeat:Filesystem): Started ha1
May 10 15:18:29 ha1 pengine: [8473]: notice: native_print: nfssrv (ocf::heartbeat:nfsserver): Started ha1
May 10 15:18:29 ha1 cibadmin: [29543]: info: Invoked: cibadmin -Ql
May 10 15:18:29 ha1 pengine: [8473]: notice: native_print: nfsclient (ocf::heartbeat:Filesystem): Started ha2
May 10 15:18:29 ha1 pengine: [8473]: notice: clone_print: Clone Set: cl-pinggw
May 10 15:18:29 ha1 pengine: [8473]: notice: short_print: Started: [ ha1 ha2 ]
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: NfsData: Rolling back scores from ClusterIP
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: NfsData: Rolling back scores from ClusterIP
May 10 15:18:29 ha1 pengine: [8473]: info: master_color: Promoting nfsdrbd:0 (Master ha1)
May 10 15:18:29 ha1 pengine: [8473]: info: master_color: NfsData: Promoted 1 instances of a possible 1 to master
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: nfsclient: Rolling back scores from ClusterIP
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: nfsclient: Rolling back scores from lv_drbd0
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: nfsclient: Rolling back scores from NfsFS
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: nfsclient: Rolling back scores from ClusterIP
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: ClusterIP: Rolling back scores from lv_drbd0
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: ClusterIP: Rolling back scores from SitoWeb
May 10 15:18:29 ha1 pengine: [8473]: WARN: native_color: Resource ClusterIP cannot run anywhere
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: lv_drbd0: Rolling back scores from NfsFS
May 10 15:18:29 ha1 pengine: [8473]: WARN: native_color: Resource lv_drbd0 cannot run anywhere
May 10 15:18:29 ha1 pengine: [8473]: info: native_merge_weights: NfsFS: Rolling back scores from nfssrv
May 10 15:18:29 ha1 pengine: [8473]: WARN: native_color: Resource NfsFS cannot run anywhere
May 10 15:18:29 ha1 pengine: [8473]: WARN: native_color: Resource nfssrv cannot run anywhere
May 10 15:18:29 ha1 pengine: [8473]: WARN: native_color: Resource SitoWeb cannot run anywhere
May 10 15:18:29 ha1 pengine: [8473]: info: master_color: Promoting nfsdrbd:0 (Master ha1)
May 10 15:18:29 ha1 pengine: [8473]: info: master_color: NfsData: Promoted 1 instances of a possible 1 to master
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Stop resource SitoWeb (ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Leave resource nfsdrbd:0 (Master ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Leave resource nfsdrbd:1 (Slave ha2)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Stop resource ClusterIP (ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Stop resource lv_drbd0 (ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Stop resource NfsFS (ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Stop resource nfssrv (ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Stop resource nfsclient (Started ha2)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Leave resource pinggw:0 (Started ha1)
May 10 15:18:29 ha1 pengine: [8473]: notice: LogActions: Leave resource pinggw:1 (Started ha2)
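[Editor's note: for comparison, the commonly documented form of this kind of constraint tests the node attribute that the ping resource agent maintains, not the resource name. By default ocf:pacemaker:ping writes an attribute named "pingd" (configurable via the RA's "name" parameter), which matches the "attrd_trigger_update: ... pingd (100)" lines in the log above; a recurring monitor operation is what keeps that attribute refreshed. A sketch, assuming the default attribute name:

primitive pinggw ocf:pacemaker:ping \
        params host_list="192.168.101.1" multiplier="100" \
        op monitor interval="15s" timeout="60s"
clone cl-pinggw pinggw \
        meta globally-unique="false"
location nfs-group-with-pinggw nfs-group \
        rule -inf: not_defined pingd or pingd lte 0

The rule references the attribute ("pingd") rather than the primitive or clone id, and the constraint is placed on nfs-group itself; the clone's only role is to run the ping agent on every node so the attribute gets set there.]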
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf