Hi All,

We are using Pacemaker 1.1.6 on Ubuntu Precise. We have two nodes running KVM as the hypervisor, with one VM on each node (vm-0/vm-1); the two VMs are paired in an HA cluster using heartbeat/pacemaker. We have redundant network links between the VMs, and we have also set up our own STONITH plugin that interacts with the underlying node's hypervisor to reboot the VMs.
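For context, the plugin implements the usual external STONITH command set that you can see being invoked in the logs below (getinfo-*, getconfignames, status, reset <host>). A stripped-down sketch of the idea, with virsh standing in for our actual hypervisor call and "hypervisor_uri" being a made-up parameter name for the sketch:

    #!/usr/bin/env python
    # Illustrative sketch of an external/* STONITH plugin; our real
    # vmstonith talks to the node's hypervisor management interface,
    # virsh merely stands in for that here.
    import os, subprocess, sys

    # Resource parameters reach external plugins as environment variables.
    HYPERVISOR_URI = os.environ.get("hypervisor_uri", "qemu:///system")

    VIRSH_OP = {"reset": "reset", "off": "destroy", "on": "start"}

    def virsh(*args):
        return subprocess.call(["virsh", "-c", HYPERVISOR_URI] + list(args))

    def main():
        op = sys.argv[1] if len(sys.argv) > 1 else ""
        if op in VIRSH_OP:
            victim = sys.argv[2]                  # e.g. "vm-0"
            sys.exit(virsh(VIRSH_OP[op], victim))
        elif op == "status":
            sys.exit(virsh("list"))               # hypervisor reachable?
        elif op == "gethosts":
            print("vm-0\nvm-1")                   # hosts this device can fence
            sys.exit(0)
        elif op == "getconfignames":
            print("hypervisor_uri")
            sys.exit(0)
        elif op.startswith("getinfo-"):
            print("vmstonith: reboots a peer VM via the underlying hypervisor")
            sys.exit(0)
        sys.exit(1)

    if __name__ == "__main__":
        main()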
We ran into an issue where the underlying node crashed badly, but its networking and the VM on it kept running for some time longer, albeit slowly. Heartbeat on the healthy VM initially detected that the peer VM had died and started recovery with a STONITH, but when late heartbeats from the peer suddenly began to arrive, it aborted the STONITH recovery and also unset itself (the current VM) as DC. The STONITH operation, however, was progressing independently and managed to reboot the peer VM, which was later recognized via further Link Down events. But this VM never took over DC after that, and the cluster sat with a single VM and no DC for a long period, until we identified and rebooted the crashed node. The full logs are below. Any pointers or insights on what fix should be made within heartbeat/pacemaker would be greatly appreciated. Thanks.

(from the logs) vm-1 recognizes that vm-0 went away via Link Down events and starts recovery:

Nov 9 16:58:40 vm-1 heartbeat: [2378]: WARN: node vm-0: is dead
Nov 9 16:58:40 vm-1 heartbeat: [2378]: info: Link vm-0:eth0 dead.
Nov 9 16:58:40 vm-1 heartbeat: [2378]: info: Link vm-0:eth2 dead.
Nov 9 16:58:40 vm-1 crmd: [2478]: notice: crmd_ha_status_callback: Status update: Node vm-0 now has status [dead]
Nov 9 16:58:40 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_NOT_PRIMARY from ccm
Nov 9 16:58:40 vm-1 cib: [2474]: info: mem_handle_event: instance=54, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4
Nov 9 16:58:40 vm-1 cib: [2474]: info: cib_ccm_msg_callback: Processing CCM event=NOT PRIMARY (id=54)
Nov 9 16:58:40 vm-1 crmd: [2478]: info: crm_update_peer_proc: vm-0.ais is now offline
Nov 9 16:58:40 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_NOT_PRIMARY from ccm
Nov 9 16:58:40 vm-1 crmd: [2478]: info: mem_handle_event: instance=54, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4
Nov 9 16:58:40 vm-1 crmd: [2478]: info: crmd_ccm_msg_callback: Quorum lost after event=NOT PRIMARY (id=54)
Nov 9 16:58:41 vm-1 ccm: [2473]: info: ccm_state_sent_memlistreq: directly callccm_compute_and_send_final_memlist()
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: quorum plugin: majority
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: quorum plugin: twonodes
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 16:58:41 vm-1 ccm: [2473]: info: Break tie for 2 nodes cluster
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: quorum plugin: majority
Nov 9 16:58:41 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 16:58:41 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 16:58:41 vm-1 cib: [2474]: info: mem_handle_event: no mbr_track info
Nov 9 16:58:41 vm-1 crmd: [2478]: info: mem_handle_event: no mbr_track info
Nov 9 16:58:41 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 16:58:41 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 16:58:41 vm-1 crmd: [2478]: info: mem_handle_event: instance=55, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
Nov 9 16:58:41 vm-1 crmd: [2478]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=55)
Nov 9 16:58:41 vm-1 crmd: [2478]: info: ccm_event_detail: NEW MEMBERSHIP: trans=55, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=3
Nov 9 16:58:41 vm-1 cib: [2474]: info: mem_handle_event: instance=55, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3
Nov 9 16:58:41 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-1 [nodeid=1, born=55]
Nov 9 16:58:41 vm-1 cib: [2474]: info: cib_ccm_msg_callback: Processing CCM event=NEW MEMBERSHIP (id=55)
Nov 9 16:58:41 vm-1 crmd: [2478]: info: ccm_event_detail: #011LOST: vm-0 [nodeid=0, born=52]
Nov 9 16:58:41 vm-1 cib: [2474]: info: crm_update_peer: Node vm-0: id=0 state=lost (new) addr=(null) votes=-1 born=52 seen=54 proc=00000000000000000000000000000302
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: quorum plugin: twonodes
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Nov 9 16:58:41 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
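The quorum debug lines above are worth decoding: with one member left out of two nodes, the plain majority rule alone would deny quorum, and it is the twonodes tie-break ("Break tie for 2 nodes cluster") that lets the lone survivor keep quorum and proceed with recovery. Roughly (a sketch of the decision, not the actual ccm code):

    # Rough model of the quorum decision in the ccm debug lines (a sketch,
    # not the actual ccm code): every node carries 100 votes; "majority"
    # wants strictly more than half of the total, and "twonodes" breaks
    # the tie so the lone survivor of a 2-node cluster keeps quorum.
    def have_quorum(member_count, total_node_count, votes_per_node=100):
        member_votes = member_count * votes_per_node     # e.g. 1 * 100
        total_votes = total_node_count * votes_per_node  # e.g. 2 * 100
        if 2 * member_votes > total_votes:               # strict majority
            return True
        # "Break tie for 2 nodes cluster":
        return total_node_count == 2 and member_count == 1

    assert have_quorum(member_count=1, total_node_count=2)  # vm-1 alone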
As part of the recovery, vm-1 unsets the peer VM (vm-0) as DC, runs an election as the only remaining member, takes over as DC itself, and starts pengine:

Nov 9 16:58:41 vm-1 ccm: [2473]: info: Break tie for 2 nodes cluster
Nov 9 16:58:41 vm-1 crmd: [2478]: info: ais_status_callback: status: vm-0 is now lost (was member)
Nov 9 16:58:41 vm-1 crmd: [2478]: info: crm_update_peer: Node vm-0: id=0 state=lost (new) addr=(null) votes=-1 born=52 seen=54 proc=00000000000000000000000000000200
Nov 9 16:58:41 vm-1 crmd: [2478]: WARN: check_dead_member: Our DC node (vm-0) left the cluster
Nov 9 16:58:41 vm-1 crmd: [2478]: info: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=check_dead_member ]
Nov 9 16:58:41 vm-1 crmd: [2478]: info: update_dc: Unset DC vm-0
Nov 9 16:58:41 vm-1 crmd: [2478]: info: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Nov 9 16:58:41 vm-1 crmd: [2478]: info: do_te_control: Registering TE UUID: 4717ec7c-684e-4718-998f-fb1938efed3b
Nov 9 16:58:41 vm-1 crmd: [2478]: info: set_graph_functions: Setting custom graph functions
Nov 9 16:58:41 vm-1 crmd: [2478]: info: unpack_graph: Unpacked transition -1: 0 actions in 0 synapses
Nov 9 16:58:41 vm-1 crmd: [2478]: info: start_subsystem: Starting sub-system "pengine"
Nov 9 16:58:41 vm-1 pengine: [25018]: info: Invoked: /usr/lib/heartbeat/pengine
Nov 9 16:58:45 vm-1 crmd: [2478]: info: do_dc_takeover: Taking over DC status for this partition
Nov 9 16:58:45 vm-1 cib: [2474]: info: cib_process_readwrite: We are now in R/W mode
Nov 9 16:58:45 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_master for section 'all' (origin=local/crmd/28, version=1.30.149): ok (rc=0)
Nov 9 16:58:45 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/29, version=1.30.150): ok (rc=0)
Nov 9 16:58:45 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/31, version=1.30.151): ok (rc=0)
Nov 9 16:58:45 vm-1 crmd: [2478]: info: join_make_offer: Making join offers based on membership 55
Nov 9 16:58:45 vm-1 crmd: [2478]: info: do_dc_join_offer_all: join-1: Waiting on 1 outstanding join acks
Nov 9 16:58:45 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/33, version=1.30.152): ok (rc=0)
Nov 9 16:58:45 vm-1 crmd: [2478]: info: config_query_callback: Shutdown escalation occurs after: 1200000ms
Nov 9 16:58:45 vm-1 crmd: [2478]: info: config_query_callback: Checking for expired actions every 900000ms
Nov 9 16:58:45 vm-1 crmd: [2478]: info: update_dc: Set DC to vm-1 (3.0.5)
Nov 9 16:58:46 vm-1 crmd: [2478]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
Nov 9 16:58:46 vm-1 crmd: [2478]: info: do_state_transition: All 1 cluster nodes responded to the join offer.
Nov 9 16:58:46 vm-1 crmd: [2478]: info: do_dc_join_finalize: join-1: Syncing the CIB from vm-1 to the rest of the cluster
Nov 9 16:58:46 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/36, version=1.30.152): ok (rc=0)
Nov 9 16:58:46 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/37, version=1.30.153): ok (rc=0)
Nov 9 16:58:46 vm-1 lrmd: [2475]: info: stonith_api_device_metadata: looking up external/vmstonith/heartbeat metadata
Nov 9 16:58:46 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getinfo-devdescr
Nov 9 16:58:46 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getinfo-devid
Nov 9 16:58:46 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getinfo-xml

While this is happening, a late heartbeat arrives from the peer VM, which triggers an abort of the recovery that is under way:

Nov 9 16:58:47 vm-1 heartbeat: [2378]: CRIT: Cluster node vm-0 returning after partition.
Nov 9 16:58:47 vm-1 crmd: [2478]: notice: crmd_ha_status_callback: Status update: Node vm-0 now has status [active]
Nov 9 16:58:47 vm-1 heartbeat: [2378]: info: For information on cluster partitions, See URL: http://linux-ha.org/wiki/Split_Brain
Nov 9 16:58:47 vm-1 crmd: [2478]: info: crm_update_peer_proc: vm-0.ais is now online
Nov 9 16:58:47 vm-1 heartbeat: [2378]: WARN: Deadtime value may be too small.
Nov 9 16:58:47 vm-1 heartbeat: [2378]: info: See FAQ for information on tuning deadtime.
Nov 9 16:58:47 vm-1 heartbeat: [2378]: info: URL: http://linux-ha.org/wiki/FAQ#Heavy_Load
Nov 9 16:58:47 vm-1 heartbeat: [2378]: info: Link vm-0:eth0 up.
Nov 9 16:58:47 vm-1 heartbeat: [2378]: WARN: Late heartbeat: Node vm-0: interval 14000 ms
Nov 9 16:58:47 vm-1 heartbeat: [2378]: info: Status update for node vm-0: status active
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_dc_join_ack: join-1: Updating node state to member for vm-1
Nov 9 16:58:47 vm-1 heartbeat: [2378]: info: Link vm-0:eth2 up.
Nov 9 16:58:47 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='vm-1']/lrm (origin=local/crmd/39, version=1.30.155): ok (rc=0)
Nov 9 16:58:47 vm-1 crmd: [2478]: info: erase_xpath_callback: Deletion of "//node_state[@uname='vm-1']/lrm": ok (rc=0)
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
Nov 9 16:58:47 vm-1 crmd: [2478]: info: populate_cib_nodes_ha: Requesting the list of configured nodes
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date
Nov 9 16:58:47 vm-1 crmd: [2478]: info: crm_update_quorum: Updating quorum status to true (call=43)
Nov 9 16:58:47 vm-1 attrd: [2477]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Nov 9 16:58:47 vm-1 crmd: [2478]: info: abort_transition_graph: do_te_invoke:167 - Triggered transition abort (complete=1) : Peer Cancelled
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_pe_invoke: Query 44: Requesting the current CIB: S_POLICY_ENGINE
Nov 9 16:58:47 vm-1 crmd: [2478]: WARN: crmd_ha_msg_callback: Ignoring HA message (op=vote) from vm-0: not in our membership list (size=1)
Nov 9 16:58:47 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/41, version=1.30.157): ok (rc=0)
Nov 9 16:58:47 vm-1 crmd: [2478]: WARN: match_down_event: No match for shutdown action on 00000281-0281-0000-2b91-000000000000
Nov 9 16:58:47 vm-1 crmd: [2478]: info: te_update_diff: Stonith/shutdown of 00000281-0281-0000-2b91-000000000000 not matched
Nov 9 16:58:47 vm-1 crmd: [2478]: info: abort_transition_graph: te_update_diff:215 - Triggered transition abort (complete=1, tag=node_state, id=00000281-0281-0000-2b91-000000000000, magic=NA, cib=1.30.158) : Node failure
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_pe_invoke: Query 45: Requesting the current CIB: S_POLICY_ENGINE
Nov 9 16:58:47 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/43, version=1.30.159): ok (rc=0)
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_pe_invoke_callback: Invoking the PE: query=45, ref=pe_calc-dc-1447106327-10, seq=55, quorate=1
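"Deadtime value may be too small" together with "Late heartbeat: Node vm-0: interval 14000 ms" shows the sick node was still emitting heartbeats, just extremely late. Independent of the DC problem, we probably need to retune ha.cf so a slow-but-alive peer is not declared dead prematurely; something along these lines (values illustrative, with deadtime comfortably above the worst late interval seen under load, per the FAQ linked in the logs):

    keepalive 2     # seconds between heartbeats
    warntime 10     # warn about late heartbeats after 10s
    deadtime 30     # declare a peer dead only after 30s (> the 14s we saw)
    initdead 60     # extra allowance right after startup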
But pengine is disconnected from these late-arrival events at this point and goes ahead scheduling a reboot of the peer VM:

Nov 9 16:58:47 vm-1 pengine: [25018]: notice: unpack_config: On loss of CCM Quorum: Ignore
Nov 9 16:58:47 vm-1 attrd: [2477]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Nov 9 16:58:47 vm-1 pengine: [25018]: WARN: pe_fence_node: Node vm-0 will be fenced because it is un-expectedly down
Nov 9 16:58:47 vm-1 pengine: [25018]: WARN: determine_online_status: Node vm-0 is unclean
Nov 9 16:58:47 vm-1 pengine: [25018]: WARN: custom_action: Action VM-FENCE:0_stop_0 on vm-0 is unrunnable (offline)
Nov 9 16:58:47 vm-1 pengine: [25018]: WARN: custom_action: Marking node vm-0 unclean
Nov 9 16:58:47 vm-1 pengine: [25018]: WARN: stage6: Scheduling Node vm-0 for STONITH
Nov 9 16:58:47 vm-1 pengine: [25018]: notice: LogActions: Stop VM-FENCE:0#011(vm-0)
Nov 9 16:58:47 vm-1 pengine: [25018]: notice: LogActions: Leave VM-FENCE:1#011(Started vm-1)
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Nov 9 16:58:47 vm-1 crmd: [2478]: info: unpack_graph: Unpacked transition 0: 33 actions in 33 synapses
Nov 9 16:58:47 vm-1 crmd: [2478]: info: do_te_invoke: Processing graph 0 (ref=pe_calc-dc-1447106327-10) derived from /var/lib/pengine/pe-warn-1.bz2
Nov 9 16:58:47 vm-1 crmd: [2478]: info: te_pseudo_action: Pseudo action 14 fired and confirmed
Nov 9 16:58:47 vm-1 crmd: [2478]: info: te_pseudo_action: Pseudo action 9 fired and confirmed
Nov 9 16:58:47 vm-1 crmd: [2478]: info: te_pseudo_action: Pseudo action 15 fired and confirmed
Nov 9 16:58:47 vm-1 crmd: [2478]: info: te_fence_node: Executing reboot fencing operation (51) on vm-0 (timeout=90000)
Nov 9 16:58:47 vm-1 stonith-ng: [2476]: info: initiate_remote_stonith_op: Initiating remote operation reboot for vm-0: c3bda7ff-b6ec-433e-86a1-e570dbaacc8b
Nov 9 16:58:47 vm-1 stonith-ng: [2476]: info: can_fence_host_with_device: VM-FENCE:1 can fence vm-0: none
Nov 9 16:58:48 vm-1 ccm: [2473]: debug: quorum plugin: majority
Nov 9 16:58:48 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Nov 9 16:58:48 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: process_remote_stonith_query: Ignoring reply from vm-0, hosts are not permitted to commit suicide
Nov 9 16:58:48 vm-1 ccm: [2473]: debug: quorum plugin: twonodes
Nov 9 16:58:48 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: call_remote_stonith: Requesting that vm-1 perform op reboot vm-0
Nov 9 16:58:48 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 16:58:48 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 16:58:48 vm-1 ccm: [2473]: info: Break tie for 2 nodes cluster
Nov 9 16:58:48 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 16:58:48 vm-1 crmd: [2478]: info: mem_handle_event: no mbr_track info
Nov 9 16:58:48 vm-1 cib: [2474]: info: mem_handle_event: no mbr_track info
Nov 9 16:58:48 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 16:58:48 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 16:58:48 vm-1 crmd: [2478]: info: mem_handle_event: instance=56, nodes=1, new=0, lost=0, n_idx=0, new_idx=1, old_idx=3
Nov 9 16:58:48 vm-1 cib: [2474]: info: mem_handle_event: instance=56, nodes=1, new=0, lost=0, n_idx=0, new_idx=1, old_idx=3
Nov 9 16:58:48 vm-1 crmd: [2478]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=56)
Nov 9 16:58:48 vm-1 cib: [2474]: info: cib_ccm_msg_callback: Processing CCM event=NEW MEMBERSHIP (id=56)
Nov 9 16:58:48 vm-1 crmd: [2478]: info: ccm_event_detail: NEW MEMBERSHIP: trans=56, nodes=1, new=0, lost=0 n_idx=0, new_idx=1, old_idx=3
Nov 9 16:58:48 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-1 [nodeid=1, born=56]
Nov 9 16:58:48 vm-1 crmd: [2478]: info: populate_cib_nodes_ha: Requesting the list of configured nodes
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: stonith_fence: Exec <stonith_command t="stonith-ng" st_async_id="c3bda7ff-b6ec-433e-86a1-e570dbaacc8b" st_op="st_fence" st_callid="0" st_callopt="0" st_remote_op="c3bda7ff-b6ec-433e-86a1-e570dbaacc8b" st_target="vm-0" st_device_action="reboot" st_timeout="81000" dest="vm-1" oseq="2" from_id="stonith-ng" to_id="stonith-ng" client_gen="2" src="vm-1" seq="1015" hg="5487db2a" ts="56411718" ld="0.02 0.04 0.05 4/369 25102" ttl="3" auth="1 06b58c47479e1fb80d0979b4b8f367b60190789f" />
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: can_fence_host_with_device: VM-FENCE:1 can fence vm-0: none
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: stonith_fence: Found 1 matching devices for 'vm-0'
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: stonith_command: Processed st_fence from vm-1: rc=-1
Nov 9 16:58:48 vm-1 stonith-ng: [2476]: info: make_args: reboot-ing node 'vm-0' as 'port=vm-0'
Nov 9 16:58:48 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getconfignames
Nov 9 16:58:48 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith reset vm-0
Nov 9 16:58:48 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/47, version=1.30.164): ok (rc=0)
A membership event then arrives:

Nov 9 16:58:49 vm-1 ccm: [2473]: debug: quorum plugin: majority
Nov 9 16:58:49 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=2, member_quorum_votes=200
Nov 9 16:58:49 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 16:58:49 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 16:58:49 vm-1 crmd: [2478]: info: mem_handle_event: no mbr_track info
Nov 9 16:58:49 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 16:58:49 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 16:58:49 vm-1 cib: [2474]: info: mem_handle_event: no mbr_track info
Nov 9 16:58:49 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 16:58:49 vm-1 cib: [2474]: info: mem_handle_event: instance=57, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
Nov 9 16:58:49 vm-1 cib: [2474]: info: cib_ccm_msg_callback: Processing CCM event=NEW MEMBERSHIP (id=57)
Nov 9 16:58:49 vm-1 cib: [2474]: info: crm_update_peer: Node vm-0: id=0 state=member (new) addr=(null) votes=-1 born=57 seen=57 proc=00000000000000000000000000000302
Nov 9 16:58:49 vm-1 crmd: [2478]: info: mem_handle_event: instance=57, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
Nov 9 16:58:49 vm-1 crmd: [2478]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=57)
Nov 9 16:58:49 vm-1 crmd: [2478]: info: ccm_event_detail: NEW MEMBERSHIP: trans=57, nodes=2, new=1, lost=0 n_idx=0, new_idx=2, old_idx=4
Nov 9 16:58:49 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-1 [nodeid=1, born=1]
Nov 9 16:58:49 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-0 [nodeid=0, born=57]
Nov 9 16:58:49 vm-1 crmd: [2478]: info: ccm_event_detail: #011NEW: vm-0 [nodeid=0, born=57]
Nov 9 16:58:49 vm-1 crmd: [2478]: info: ais_status_callback: status: vm-0 is now member (was lost)
Nov 9 16:58:49 vm-1 crmd: [2478]: WARN: match_down_event: No match for shutdown action on vm-0
Nov 9 16:58:49 vm-1 crmd: [2478]: info: crm_update_peer: Node vm-0: id=0 state=member (new) addr=(null) votes=-1 born=57 seen=57 proc=00000000000000000000000000000202
Nov 9 16:58:49 vm-1 crmd: [2478]: info: populate_cib_nodes_ha: Requesting the list of configured nodes
Nov 9 16:58:49 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='vm-0']/lrm (origin=local/crmd/49, version=1.30.166): ok (rc=0)
Nov 9 16:58:49 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='vm-0']/transient_attributes (origin=local/crmd/50, version=1.30.167): ok (rc=0)
Nov 9 16:58:49 vm-1 crmd: [2478]: info: erase_xpath_callback: Deletion of "//node_state[@uname='vm-0']/lrm": ok (rc=0)
Nov 9 16:58:49 vm-1 crmd: [2478]: info: abort_transition_graph: te_update_diff:178 - Triggered transition abort (complete=0, tag=transient_attributes, id=00000281-0281-0000-2b91-000000000000, magic=NA, cib=1.30.167) : Transient attribute: removal
Nov 9 16:58:49 vm-1 crmd: [2478]: info: erase_xpath_callback: Deletion of "//node_state[@uname='vm-0']/transient_attributes": ok (rc=0)
Nov 9 16:58:49 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/51, version=1.30.168): ok (rc=0)
As part of this, vm-1 unsets itself as DC and defers to the peer VM as the DC; but it never calls update_dc() to record the peer (vm-0) as DC at this point:

Nov 9 16:58:50 vm-1 crmd: [2478]: WARN: crmd_ha_msg_filter: Another DC detected: vm-0 (op=noop)
Nov 9 16:58:50 vm-1 crmd: [2478]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=crmd_ha_msg_filter ]
Nov 9 16:58:50 vm-1 crmd: [2478]: info: update_dc: Unset DC vm-1
Nov 9 16:58:51 vm-1 crmd: [2478]: info: do_state_transition: State transition S_ELECTION -> S_RELEASE_DC [ input=I_RELEASE_DC cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 9 16:58:51 vm-1 crmd: [2478]: info: do_dc_release: DC role released
Nov 9 16:58:51 vm-1 crmd: [2478]: info: stop_subsystem: Sent -TERM to pengine: [25018]
Nov 9 16:58:51 vm-1 crmd: [2478]: WARN: destroy_action: Cancelling timer for action 3 (src=57)
Nov 9 16:58:51 vm-1 crmd: [2478]: info: do_te_control: Transitioner is now inactive
Nov 9 16:58:51 vm-1 crmd: [2478]: info: do_state_transition: State transition S_RELEASE_DC -> S_PENDING [ input=I_RELEASE_SUCCESS cause=C_FSA_INTERNAL origin=do_dc_release ]
Nov 9 16:58:51 vm-1 crmd: [2478]: info: crmdManagedChildDied: Process pengine:[25018] exited (signal=0, exitcode=0)
Nov 9 16:58:51 vm-1 crmd: [2478]: info: pe_msg_dispatch: Received HUP from pengine:[25018]
Nov 9 16:58:51 vm-1 crmd: [2478]: info: pe_connection_destroy: Connection to the Policy Engine released
Nov 9 16:58:51 vm-1 cib: [2474]: info: cib_process_readwrite: We are now in R/O mode
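Our reading of the above: the DC role is released because vm-0 claimed it, but update_dc() is never called to record vm-0 as the new DC, so when our own still-in-flight STONITH kills vm-0 moments later, the "our DC left the cluster" path (check_dead_member) has nothing to match and no new election fires. A toy model of that bookkeeping, purely illustrative and not the real crmd code:

    # Toy model of the crmd DC bookkeeping around this race (illustrative
    # only, not the real crmd FSA).
    class ToyCrmd(object):
        def __init__(self):
            self.dc = None            # which peer we currently believe is DC
            self.state = "S_NOT_DC"

        def peer_claims_dc(self, peer):
            # crmd_ha_msg_filter: "Another DC detected" -> release our DC
            # role... but note that nothing records 'peer' as the DC here.
            self.state = "S_PENDING"

        def peer_lost(self, peer):
            # check_dead_member: only a *recorded* DC dying forces an election
            if self.dc == peer:
                self.state = "S_ELECTION"

    crmd = ToyCrmd()
    crmd.peer_claims_dc("vm-0")       # 16:58:50, DC role released
    crmd.peer_lost("vm-0")            # 16:58:59+, our STONITH reboots vm-0
    assert crmd.state == "S_PENDING"  # stuck: no DC, and no election fires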
seq="102d" hg="5487db2a" ts="56411723" ld="0.02 0.04 0.05 1/370 25223" ttl="3" auth="1 e388a45fa82902e4d2cc13372bd016c90878a052" state="2" st_target="vm-0" /> Nov 9 16:58:59 vm-1 crmd: [2478]: info: tengine_stonith_callback: Stonith operation 2/51:0:0:4717ec7c-684e-4718-998f-fb1938efed3b: OK (0) Nov 9 16:58:59 vm-1 crmd: [2478]: info: tengine_stonith_notify: Peer vm-0 was terminated (reboot) by vm-1 for vm-1 (ref=c3bda7ff-b6ec-433e-86a1-e570dbaacc8b): OK Nov 9 16:59:01 vm-1 heartbeat: [2378]: WARN: node vm-0: is dead Nov 9 16:59:01 vm-1 heartbeat: [2378]: info: Link vm-0:eth0 dead. Nov 9 16:59:01 vm-1 heartbeat: [2378]: info: Link vm-0:eth2 dead. Nov 9 16:59:01 vm-1 ccm: [2473]: debug: quorum plugin: majority Nov 9 16:59:01 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100 Nov 9 16:59:01 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200 Nov 9 16:59:01 vm-1 ccm: [2473]: debug: quorum plugin: twonodes Nov 9 16:59:01 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=1, member_quorum_votes=100 Nov 9 16:59:01 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200 Nov 9 16:59:01 vm-1 ccm: [2473]: info: Break tie for 2 nodes cluster Nov 9 16:59:01 vm-1 crmd: [2478]: notice: crmd_ha_status_callback: Status update: Node vm-0 now has status [dead] Nov 9 16:59:01 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm Nov 9 16:59:01 vm-1 cib: [2474]: info: mem_handle_event: no mbr_track info Nov 9 16:59:01 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm Nov 9 16:59:01 vm-1 cib: [2474]: info: mem_handle_event: instance=58, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3 Nov 9 16:59:01 vm-1 cib: [2474]: info: cib_ccm_msg_callback: Processing CCM event=NEW MEMBERSHIP (id=58) Nov 9 16:59:01 vm-1 cib: [2474]: info: crm_update_peer: Node vm-0: id=0 state=lost (new) addr=(null) votes=-1 born=57 seen=57 proc=00000000000000000000000000000302 Nov 9 16:59:01 vm-1 crmd: [2478]: info: crm_update_peer_proc: vm-0.ais is now offline Nov 9 16:59:01 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm Nov 9 16:59:01 vm-1 crmd: [2478]: info: mem_handle_event: no mbr_track info Nov 9 16:59:01 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm Nov 9 16:59:01 vm-1 crmd: [2478]: info: mem_handle_event: instance=58, nodes=1, new=0, lost=1, n_idx=0, new_idx=1, old_idx=3 Nov 9 16:59:01 vm-1 crmd: [2478]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=58) Nov 9 16:59:01 vm-1 crmd: [2478]: info: ccm_event_detail: NEW MEMBERSHIP: trans=58, nodes=1, new=0, lost=1 n_idx=0, new_idx=1, old_idx=3 Nov 9 16:59:01 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-1 [nodeid=1, born=58] Nov 9 16:59:01 vm-1 crmd: [2478]: info: ccm_event_detail: #011LOST: vm-0 [nodeid=0, born=57] Nov 9 16:59:01 vm-1 crmd: [2478]: info: ais_status_callback: status: vm-0 is now lost (was member) Nov 9 16:59:01 vm-1 crmd: [2478]: info: crm_update_peer: Node vm-0: id=0 state=lost (new) addr=(null) votes=-1 born=57 seen=57 proc=00000000000000000000000000000200 Nov 9 17:00:08 vm-1 stonith-ng: [2476]: info: stonith_command: Processed st_execute from lrmd: rc=-1 Nov 9 17:00:08 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getconfignames Nov 9 17:00:08 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith status Nov 9 17:00:09 vm-1 stonith: [26002]: info: external/vmstonith 
But by this time vm-0 was never recorded as the DC, so no DC takeover happens and vm-1 simply remains a member, taking no action.
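Until the root cause is fixed in heartbeat/pacemaker, we are considering a crude external watchdog so that a DC-less cluster cannot go unnoticed for half an hour again. A sketch, assuming crmadmin -D (as on our 1.1.6 boxes) prints the DC when there is one and fails, or prints nothing useful, when there is none:

    #!/usr/bin/env python
    # Crude watchdog sketch: complain if the cluster reports no DC for
    # too long. Illustrative only; adjust the crmadmin invocation and
    # output parsing to taste.
    import subprocess, time

    CHECK_INTERVAL = 60      # seconds between polls
    MAX_DC_LESS = 300        # complain after 5 minutes without a DC

    def current_dc():
        try:
            out = subprocess.check_output(["crmadmin", "-D"],
                                          stderr=subprocess.STDOUT)
        except subprocess.CalledProcessError:
            return None
        return out.strip() or None

    dc_less_since = None
    while True:
        if current_dc() is not None:
            dc_less_since = None
        elif dc_less_since is None:
            dc_less_since = time.time()
        elif time.time() - dc_less_since > MAX_DC_LESS:
            print("WARNING: no DC for %d seconds"
                  % int(time.time() - dc_less_since))
        time.sleep(CHECK_INTERVAL)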
Once we managed to reboot the crashed underlying node, heartbeat on the peer VM was restarted:

Nov 9 17:29:02 vm-1 heartbeat: [2378]: info: Heartbeat restart on node vm-0
Nov 9 17:29:02 vm-1 heartbeat: [2378]: info: Link vm-0:eth0 up.
Nov 9 17:29:02 vm-1 heartbeat: [2378]: info: Status update for node vm-0: status init
Nov 9 17:29:02 vm-1 heartbeat: [2378]: info: Link vm-0:eth2 up.
Nov 9 17:29:02 vm-1 heartbeat: [2378]: info: Status update for node vm-0: status up
Nov 9 17:29:02 vm-1 crmd: [2478]: notice: crmd_ha_status_callback: Status update: Node vm-0 now has status [init]
Nov 9 17:29:02 vm-1 crmd: [2478]: info: crm_update_peer_proc: vm-0.ais is now online
Nov 9 17:29:02 vm-1 crmd: [2478]: notice: crmd_ha_status_callback: Status update: Node vm-0 now has status [up]
Nov 9 17:29:03 vm-1 heartbeat: [2378]: debug: get_delnodelist: delnodelist=
Nov 9 17:29:03 vm-1 heartbeat: [2378]: info: all clients are now paused
Nov 9 17:29:03 vm-1 heartbeat: [2378]: info: Status update for node vm-0: status active
Nov 9 17:29:03 vm-1 crmd: [2478]: notice: crmd_ha_status_callback: Status update: Node vm-0 now has status [active]
Nov 9 17:29:04 vm-1 cib: [2474]: info: cib_client_status_callback: Status update: Client vm-0/cib now has status [join]
Nov 9 17:29:05 vm-1 heartbeat: [2378]: info: all clients are now resumed
Nov 9 17:29:11 vm-1 heartbeat: [2378]: WARN: 1 lost packet(s) for [vm-0] [19:21]
Nov 9 17:29:11 vm-1 heartbeat: [2378]: info: No pkts missing from vm-0!
Nov 9 17:29:11 vm-1 ccm: [2473]: debug: quorum plugin: majority
Nov 9 17:29:11 vm-1 crmd: [2478]: notice: crmd_client_status_callback: Status update: Client vm-0/crmd now has status [online] (DC=false)
Nov 9 17:29:11 vm-1 ccm: [2473]: debug: cluster:linux-ha, member_count=2, member_quorum_votes=200
Nov 9 17:29:11 vm-1 ccm: [2473]: debug: total_node_count=2, total_quorum_votes=200
Nov 9 17:29:11 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 17:29:11 vm-1 cib: [2474]: info: mem_handle_event: no mbr_track info
Nov 9 17:29:11 vm-1 cib: [2474]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 17:29:11 vm-1 cib: [2474]: info: mem_handle_event: instance=59, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
Nov 9 17:29:11 vm-1 cib: [2474]: info: cib_ccm_msg_callback: Processing CCM event=NEW MEMBERSHIP (id=59)
Nov 9 17:29:11 vm-1 cib: [2474]: info: crm_update_peer: Node vm-0: id=0 state=member (new) addr=(null) votes=-1 born=59 seen=59 proc=00000000000000000000000000000302
Nov 9 17:29:11 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm
Nov 9 17:29:11 vm-1 crmd: [2478]: info: mem_handle_event: no mbr_track info
Nov 9 17:29:11 vm-1 crmd: [2478]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 9 17:29:11 vm-1 crmd: [2478]: info: mem_handle_event: instance=59, nodes=2, new=1, lost=0, n_idx=0, new_idx=2, old_idx=4
Nov 9 17:29:11 vm-1 crmd: [2478]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=59)
Nov 9 17:29:11 vm-1 crmd: [2478]: info: ccm_event_detail: NEW MEMBERSHIP: trans=59, nodes=2, new=1, lost=0 n_idx=0, new_idx=2, old_idx=4
Nov 9 17:29:11 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-1 [nodeid=1, born=1]
Nov 9 17:29:11 vm-1 crmd: [2478]: info: ccm_event_detail: #011CURRENT: vm-0 [nodeid=0, born=59]
Nov 9 17:29:11 vm-1 crmd: [2478]: info: ccm_event_detail: #011NEW: vm-0 [nodeid=0, born=59]
Nov 9 17:29:11 vm-1 crmd: [2478]: info: ais_status_callback: status: vm-0 is now member (was lost)
Nov 9 17:29:11 vm-1 crmd: [2478]: info: crm_update_peer: Node vm-0: id=0 state=member (new) addr=(null) votes=-1 born=59 seen=59 proc=00000000000000000000000000000202
Nov 9 17:29:12 vm-1 heartbeat: [2378]: WARN: 1 lost packet(s) for [vm-0] [25:27]
Nov 9 17:29:12 vm-1 heartbeat: [2378]: info: No pkts missing from vm-0!
Only after this point was an election held; this VM (vm-1) took over as DC and started resources:

Nov 9 17:29:25 vm-1 crmd: [2478]: info: do_election_count_vote: Election 2 (owner: 00000281-0281-0000-2b91-000000000000) pass: vote from vm-0 (Uptime)
Nov 9 17:29:25 vm-1 crmd: [2478]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 9 17:29:26 vm-1 crmd: [2478]: info: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Nov 9 17:29:26 vm-1 crmd: [2478]: info: do_te_control: Registering TE UUID: cae16cf4-bcc7-4027-8a84-33537c7696a3
Nov 9 17:29:26 vm-1 crmd: [2478]: info: set_graph_functions: Setting custom graph functions
Nov 9 17:29:26 vm-1 crmd: [2478]: info: unpack_graph: Unpacked transition -1: 0 actions in 0 synapses
Nov 9 17:29:26 vm-1 crmd: [2478]: info: start_subsystem: Starting sub-system "pengine"
Nov 9 17:29:26 vm-1 pengine: [12152]: info: Invoked: /usr/lib/heartbeat/pengine
Nov 9 17:29:30 vm-1 crmd: [2478]: info: do_dc_takeover: Taking over DC status for this partition
Nov 9 17:29:30 vm-1 cib: [2474]: info: cib_process_readwrite: We are now in R/W mode
Nov 9 17:29:30 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_master for section 'all' (origin=local/crmd/55, version=1.30.172): ok (rc=0)
Nov 9 17:29:30 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/56, version=1.30.173): ok (rc=0)
Nov 9 17:29:30 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/58, version=1.30.174): ok (rc=0)
Nov 9 17:29:30 vm-1 crmd: [2478]: info: join_make_offer: Making join offers based on membership 59
Nov 9 17:29:30 vm-1 crmd: [2478]: info: do_dc_join_offer_all: join-2: Waiting on 2 outstanding join acks
Nov 9 17:29:30 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/60, version=1.30.175): ok (rc=0)
Nov 9 17:29:30 vm-1 crmd: [2478]: info: config_query_callback: Shutdown escalation occurs after: 1200000ms
Nov 9 17:29:30 vm-1 crmd: [2478]: info: config_query_callback: Checking for expired actions every 900000ms
Nov 9 17:29:30 vm-1 crmd: [2478]: info: update_dc: Set DC to vm-1 (3.0.5)
Nov 9 17:29:31 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_sync_one for section 'all' (origin=vm-0/vm-0/(null), version=1.30.175): ok (rc=0)
Nov 9 17:29:31 vm-1 crmd: [2478]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
Nov 9 17:29:31 vm-1 crmd: [2478]: info: do_state_transition: All 2 cluster nodes responded to the join offer.
Nov 9 17:29:31 vm-1 crmd: [2478]: info: do_dc_join_finalize: join-2: Syncing the CIB from vm-1 to the rest of the cluster
Nov 9 17:29:31 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/63, version=1.30.175): ok (rc=0)
Nov 9 17:29:31 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/64, version=1.30.176): ok (rc=0)
Nov 9 17:29:31 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/65, version=1.30.177): ok (rc=0)
Nov 9 17:29:32 vm-1 lrmd: [2475]: info: stonith_api_device_metadata: looking up external/vmstonith/heartbeat metadata
Nov 9 17:29:32 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getinfo-devdescr
Nov 9 17:29:32 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getinfo-devid
Nov 9 17:29:32 vm-1 vmstonith: Invoked as: /usr/lib/stonith/plugins/external/vmstonith getinfo-xml
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_dc_join_ack: join-2: Updating node state to member for vm-0
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_dc_join_ack: join-2: Updating node state to member for vm-1
Nov 9 17:29:32 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='vm-0']/transient_attributes (origin=vm-0/crmd/8, version=1.30.178): ok (rc=0)
Nov 9 17:29:32 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='vm-0']/lrm (origin=local/crmd/66, version=1.30.179): ok (rc=0)
Nov 9 17:29:32 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='vm-1']/lrm (origin=local/crmd/68, version=1.30.181): ok (rc=0)
Nov 9 17:29:32 vm-1 crmd: [2478]: info: erase_xpath_callback: Deletion of "//node_state[@uname='vm-0']/lrm": ok (rc=0)
Nov 9 17:29:32 vm-1 crmd: [2478]: info: erase_xpath_callback: Deletion of "//node_state[@uname='vm-1']/lrm": ok (rc=0)
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
Nov 9 17:29:32 vm-1 crmd: [2478]: info: populate_cib_nodes_ha: Requesting the list of configured nodes
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date
Nov 9 17:29:32 vm-1 crmd: [2478]: info: crm_update_quorum: Updating quorum status to true (call=72)
Nov 9 17:29:32 vm-1 attrd: [2477]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_pe_invoke: Query 73: Requesting the current CIB: S_POLICY_ENGINE
Nov 9 17:29:32 vm-1 crmd: [2478]: info: te_update_diff: Detected LRM refresh - 11 resources updated: Skipping all resource events
Nov 9 17:29:32 vm-1 crmd: [2478]: info: abort_transition_graph: te_update_diff:251 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=1.30.182) : LRM Refresh
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_pe_invoke: Query 74: Requesting the current CIB: S_POLICY_ENGINE
Nov 9 17:29:32 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/70, version=1.30.183): ok (rc=0)
Nov 9 17:29:32 vm-1 cib: [2474]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/72, version=1.30.185): ok (rc=0)
Nov 9 17:29:32 vm-1 crmd: [2478]: info: do_pe_invoke_callback: Invoking the PE: query=74, ref=pe_calc-dc-1447108172-26, seq=59, quorate=1
Nov 9 17:29:32 vm-1 attrd: [2477]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)

Thanks.
--Shyam