Hello, I have a 2-node cluster with the following configuration:

node $id="9e53a111-0dca-496c-9461-a38f3eec4d0e" mcg2 \
        attributes standby="off"
node $id="a90981f8-d993-4411-89f4-aff7156136d2" mcg1 \
        attributes standby="off"
primitive ClusterIP ocf:mcg:MCG_VIPaddr_RA \
        params ip="192.168.115.50" cidr_netmask="255.255.255.0" nic="bond1.115:1" \
        op monitor interval="40" timeout="20" \
        meta target-role="Started"
primitive EMS ocf:heartbeat:jboss \
        params jboss_home="/opt/jboss-5.1.0.GA" java_home="/opt/jdk1.6.0_29/" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="240" \
        op monitor interval="30s" timeout="40s"
primitive NDB_MGMT ocf:mcg:NDB_MGM_RA \
        op monitor interval="120" timeout="120"
primitive NDB_VIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.117.50" cidr_netmask="255.255.255.255" nic="bond0.117:1" \
        op monitor interval="30" timeout="10"
primitive Rmgr ocf:mcg:RM_RA \
        op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
        op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
primitive Tmgr ocf:mcg:TM_RA \
        op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
        op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
primitive mysql ocf:mcg:MYSQLD_RA \
        op monitor interval="180" timeout="200"
primitive ndbd ocf:mcg:NDBD_RA \
        op monitor interval="120" timeout="120"
primitive pimd ocf:mcg:PIMD_RA \
        op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
        op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
ms ms_Rmgr Rmgr \
        meta master-max="1" master-max-node="1" clone-max="2" clone-node-max="1" interleave="true" notify="true"
ms ms_Tmgr Tmgr \
        meta master-max="1" master-max-node="1" clone-max="2" clone-node-max="1" interleave="true" notify="true"
ms ms_pimd pimd \
        meta master-max="1" master-max-node="1" clone-max="2" clone-node-max="1" interleave="true" notify="true"
clone EMS_CLONE EMS \
        meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started"
clone mysqld_clone mysql \
        meta globally-unique="false" clone-max="2" clone-node-max="1"
clone ndbdclone ndbd \
        meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started"
colocation ip_with_Pimd inf: ClusterIP ms_pimd:Master
colocation ip_with_RM inf: ClusterIP ms_Rmgr:Master
colocation ip_with_TM inf: ClusterIP ms_Tmgr:Master
colocation ndb_vip-with-ndb_mgm inf: NDB_MGMT NDB_VIP
order RM-after-mysqld inf: mysqld_clone ms_Rmgr
order TM-after-RM inf: ms_Rmgr ms_Tmgr
order ip-after-pimd inf: ms_pimd ClusterIP
order mysqld-after-ndbd inf: ndbdclone mysqld_clone
order pimd-after-TM inf: ms_Tmgr ms_pimd
property $id="cib-bootstrap-options" \
        dc-version="1.0.11-55a5f5be61c367cbd676c2f0ec4f1c62b38223d7" \
        cluster-infrastructure="Heartbeat" \
        no-quorum-policy="ignore" \
        stonith-enabled="false"
rsc_defaults $id="rsc-options" \
        migration_threshold="3" \
        resource-stickiness="100"
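One thing I have noticed in the configuration, in case it is relevant: the master/slave sets (ms_Rmgr, ms_Tmgr, ms_pimd) are defined with interleave="true", but the clones they are ordered against (mysqld_clone, ndbdclone) are not, so the ordering constraints treat those clones as a whole rather than per node. If that matters, adding the attribute would look something like the following (untested in my setup):

    clone mysqld_clone mysql \
            meta globally-unique="false" clone-max="2" clone-node-max="1" interleave="true"
    clone ndbdclone ndbd \
            meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started" interleave="true"

I have not yet verified whether this changes the restart behaviour described below.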
With both nodes up and running, if the heartbeat service is stopped on either node, the following resources are restarted on the other node: mysqld_clone, ms_Rmgr, ms_Tmgr, ms_pimd, ClusterIP.

From the Heartbeat debug logs, it seems the policy engine is initiating a restart operation for the above resources, but the reason for this is not clear. Following are some excerpts from the logs:

Feb 07 11:06:31 MCG1 pengine: [20534]: info: determine_online_status: Node mcg2 is shutting down
Feb 07 11:06:31 MCG1 pengine: [20534]: info: determine_online_status: Node mcg1 is online
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_Rmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_Tmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_pimd
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource pimd:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: native_print: ClusterIP (ocf::mcg:MCG_VIPaddr_RA): Started mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Clone Set: EMS_CLONE
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource EMS:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Started: [ mcg1 mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: native_print: NDB_VIP (ocf::heartbeat:IPaddr2): Started mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: native_print: NDB_MGMT (ocf::mcg:NDB_MGM_RA): Started mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Clone Set: mysqld_clone
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource mysql:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Started: [ mcg1 mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Clone Set: ndbdclone
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource ndbd:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Started: [ mcg1 mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Rmgr:1: preferring current location (node=mcg2, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Tmgr:1: preferring current location (node=mcg2, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource pimd:1: preferring current location (node=mcg2, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource EMS:1: preferring current location (node=mcg2, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource mysql:1: preferring current location (node=mcg2, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource ndbd:1: preferring current location (node=mcg2, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Rmgr:0: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource Tmgr:0: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource pimd:0: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource ClusterIP: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource EMS:0: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource NDB_VIP: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource NDB_MGMT: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource mysql:0: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: common_apply_stickiness: Resource ndbd:0: preferring current location (node=mcg1, weight=100)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to Rmgr:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource Rmgr:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for Rmgr:1
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource Rmgr:1 cannot run anywhere
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ms_Rmgr instances of a possible 2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Rmgr:0 master score: 10
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: Promoting Rmgr:0 (Master mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Rmgr:1 master score: 0
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: ms_Rmgr: Promoted 1 instances of a possible 1 to master
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to Tmgr:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource Tmgr:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for Tmgr:1
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource Tmgr:1 cannot run anywhere
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ms_Tmgr instances of a possible 2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Tmgr:0 master score: 10
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: Promoting Tmgr:0 (Master mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: Tmgr:1 master score: 0
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: ms_Tmgr: Promoted 1 instances of a possible 1 to master
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to pimd:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource pimd:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for pimd:1
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource pimd:1 cannot run anywhere
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ms_pimd instances of a possible 2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: pimd:0 master score: 10
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: Promoting pimd:0 (Master mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_color: pimd:1 master score: 0
Feb 07 11:06:31 MCG1 pengine: [20534]: info: master_color: ms_pimd: Promoted 1 instances of a possible 1 to master
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to ClusterIP
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to EMS:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource EMS:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for EMS:1
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource EMS:1 cannot run anywhere
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 EMS_CLONE instances of a possible 2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to NDB_VIP
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to NDB_MGMT
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to mysql:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource mysql:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for mysql:1
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource mysql:1 cannot run anywhere
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 mysqld_clone instances of a possible 2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Assigning mcg1 to ndbd:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: All nodes for resource ndbd:1 are unavailable, unclean or shutting down (mcg2: 0, -1000000)
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for ndbd:1
Feb 07 11:06:31 MCG1 pengine: [20534]: info: native_color: Resource ndbd:1 cannot run anywhere
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_color: Allocated 1 ndbdclone instances of a possible 2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_create_actions: Creating actions for ms_Rmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_create_actions: Creating actions for ms_Tmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: master_create_actions: Creating actions for ms_pimd
Feb 07 11:06:31 MCG1 pengine: [20534]: info: stage6: Scheduling Node mcg2 for shutdown
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Rmgr:0 with Tmgr:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: find_compatible_child: Can't pair Tmgr:1 with ms_Rmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: No match found for Tmgr:1 (0)
Feb 07 11:06:31 MCG1 pengine: [20534]: info: clone_rsc_order_lh: Inhibiting Tmgr:1 from being active
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for Tmgr:1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Tmgr:0 with Rmgr:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Tmgr:1 with Rmgr:1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Tmgr:0 with pimd:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: find_compatible_child: Can't pair pimd:1 with ms_Tmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: No match found for pimd:1 (0)
Feb 07 11:06:31 MCG1 pengine: [20534]: info: clone_rsc_order_lh: Inhibiting pimd:1 from being active
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_assign_node: Could not allocate a node for pimd:1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing pimd:0 with Tmgr:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing pimd:1 with Tmgr:1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Rmgr:0 with mysql:0
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: clone_rsc_order_lh: Pairing Rmgr:1 with mysql:1
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource Rmgr:0 (Master mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource Rmgr:1 (mcg2)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource Tmgr:0 (Master mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource Tmgr:1 (mcg2)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource pimd:0 (Master mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource pimd:1 (mcg2)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource ClusterIP (Started mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource EMS:0 (Started mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource EMS:1 (mcg2)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource NDB_VIP (Started mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource NDB_MGMT (Started mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Restart resource mysql:0 (Started mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource mysql:1 (mcg2)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Leave resource ndbd:0 (Started mcg1)
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: LogActions: Stop resource ndbd:1 (mcg2)

Thanks in advance.

Regards,
Neha Chatrath
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org