Can you include a crm_report for your test scenario?

a) I need the pe files, but also
b) parsing line-wrapped logs is seriously painful

On 05/07/2013, at 7:09 PM, Martin Gazak <[email protected]> wrote:
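For anyone following along: assuming the stock tooling shipped with pacemaker 1.1.x, a report covering the failure window can be collected, and the stored policy-engine inputs replayed, roughly like this (the timestamps and pe-input path are taken from the log excerpt below; the destination name is arbitrary):

```shell
# Collect logs, the CIB and the PE input files from all nodes for the
# window around the Jul 04 23:45 incident (adjust times as needed).
crm_report -f "2013-07-04 23:30:00" -t "2013-07-05 00:30:00" /tmp/ims-failover

# Replay one of the stored PE inputs to see what the policy engine
# decided; -s additionally shows the allocation scores behind it.
crm_simulate -S -s -x /var/lib/pengine/pe-input-2819.bz2
```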
> Hello,
>
> we are facing a problem with a simple (I hope) cluster configuration
> with 2 nodes, ims0 and ims1, and 3 primitives (no shared storage or
> anything similar where data corruption would be a danger):
>
> - master/slave Java application ims (normally to run on both nodes
>   as master/slave, with our own OCF script) with an embedded web
>   server (to be accessed by clients)
>
> - ims-ip and ims-ip-src: shared IP address and outgoing source
>   address, to run only on the ims master
>
> Below are the software versions, the crm configuration and portions
> of the corosync log.
>
> The problem is that although the setup works most of the time (i.e.
> if the master ims application dies, the slave is promoted and the IP
> addresses are remapped), sometimes when the master ims application
> stops (fails or is killed) the failover does not occur - the slave
> ims application remains slave and the shared IP address remains
> mapped on the node with the dead ims.
>
> I even created a testbed of 2 servers, killing the ims application
> from cron every 15 minutes on the supposed MAIN server, to simulate
> the failure, observe the failover and replicate the problem
> (sometimes it works properly for hours/days).
>
> For example, today (July 4, 23:45 local time) the ims on ims0 was
> killed but remained Master - no failover of the IP addresses was
> performed and the ims on ims1 remained Slave:
>
> ============
> Last updated: Fri Jul 5 02:07:18 2013
> Last change: Thu Jul 4 23:33:46 2013
> Stack: openais
> Current DC: ims0 - partition with quorum
> Version: 1.1.7-61a079313275f3e9d0e85671f62c721d32ce3563
> 2 Nodes configured, 2 expected votes
> 6 Resources configured.
> ============
>
> Online: [ ims1 ims0 ]
>
> Master/Slave Set: ms-ims [ims]
>     Masters: [ ims0 ]
>     Slaves: [ ims1 ]
> Clone Set: clone-cluster-mon [cluster-mon]
>     Started: [ ims0 ims1 ]
> Resource Group: on-ims-master
>     ims-ip     (ocf::heartbeat:IPaddr2):   Started ims0
>     ims-ip-src (ocf::heartbeat:IPsrcaddr): Started ims0
>
> The command 'crm node standby' on ims0 did not fix things: ims0
> remained master (although standby):
>
> Node ims0: standby
> Online: [ ims1 ]
>
> Master/Slave Set: ms-ims [ims]
>     ims:0 (ocf::microstepmis:imsMS): Slave ims0 FAILED
>     Slaves: [ ims1 ]
> Clone Set: clone-cluster-mon [cluster-mon]
>     Started: [ ims1 ]
>     Stopped: [ cluster-mon:0 ]
>
> Failed actions:
>     ims:0_demote_0 (node=ims0, call=3179, rc=7, status=complete):
>     not running
>
> Stopping the openais service on ims0 completely did the trick.
>
> Could someone provide me with a hint what to do?
> - provide more information (logs, OCF script)?
> - change something in the configuration?
> - change the environment / versions?
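While gathering that information, the failed demote and any recorded fail counts can also be inspected and cleared directly. A sketch with standard pacemaker tools (resource and node names from your configuration):

```shell
# One-shot cluster status including per-resource fail counts.
crm_mon -1 --failcounts

# Clear the failure history for the ims resource once investigated;
# 'crm node standby' depends on a successful demote/stop, so a failed
# demote (rc=7 above) generally needs cleaning up first.
crm_resource --cleanup -r ims
```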
>
> Thanks a lot
>
> Martin Gazak
>
>
> Software versions:
> ------------------
> libpacemaker3-1.1.7-42.1
> pacemaker-1.1.7-42.1
> corosync-1.4.3-21.1
> libcorosync4-1.4.3-21.1
> SUSE Linux Enterprise Server 11 (x86_64)
> VERSION = 11
> PATCHLEVEL = 2
>
>
> Configuration:
> --------------
> node ims0 \
>         attributes standby="off"
> node ims1 \
>         attributes standby="off"
> primitive cluster-mon ocf:pacemaker:ClusterMon \
>         params htmlfile="/opt/ims/tomcat/webapps/ims/html/crm_status.html" \
>         op monitor interval="10"
> primitive ims ocf:microstepmis:imsMS \
>         op monitor interval="1" role="Master" timeout="20" \
>         op monitor interval="2" role="Slave" timeout="20" \
>         op start interval="0" timeout="1800s" \
>         op stop interval="0" timeout="120s" \
>         op promote interval="0" timeout="180s" \
>         meta failure-timeout="360s"
> primitive ims-ip ocf:heartbeat:IPaddr2 \
>         params ip="192.168.141.13" nic="bond1" iflabel="ims" cidr_netmask="24" \
>         op monitor interval="15s" \
>         meta failure-timeout="60s"
> primitive ims-ip-src ocf:heartbeat:IPsrcaddr \
>         params ipaddress="192.168.141.13" cidr_netmask="24" \
>         op monitor interval="15s" \
>         meta failure-timeout="60s"
> group on-ims-master ims-ip ims-ip-src
> ms ms-ims ims \
>         meta master-max="1" master-node-max="1" clone-max="2" \
>         clone-node-max="1" notify="true" target-role="Started" \
>         migration-threshold="1"
> clone clone-cluster-mon cluster-mon
> colocation ims_master inf: on-ims-master ms-ims:Master
> order ms-ims-before inf: ms-ims:promote on-ims-master:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.7-61a079313275f3e9d0e85671f62c721d32ce3563" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         no-quorum-policy="ignore" \
>         stonith-enabled="false" \
>         cluster-recheck-interval="1m" \
>         default-resource-stickiness="1000" \
>         last-lrm-refresh="1372951736" \
>         maintenance-mode="false"
>
>
> corosync.log from ims0:
> -----------------------
> Jul 04 23:45:02 ims0 crmd: [3935]: info:
> process_lrm_event: LRM operation ims:0_monitor_1000 (call=3046, rc=7, cib-update=6229, confirmed=false) not running
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_graph_event: Detected action ims:0_monitor_1000 from a different transition: 4024 vs. 4035
> Jul 04 23:45:02 ims0 crmd: [3935]: info: abort_transition_graph: process_graph_event:476 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=ims:0_last_failure_0, magic=0:7;7:4024:8:e3f096a7-4eb5-4810-9310-eb144f595e20, cib=0.717.6) : Old event
> Jul 04 23:45:02 ims0 crmd: [3935]: WARN: update_failcount: Updating failcount for ims:0 on ims0 after failed monitor: rc=7 (update=value++, time=1372952702)
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-ims:0 (1)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Jul 04 23:45:02 ims0 pengine: [3933]: WARN: unpack_rsc_op: Processing failed op ims:0_last_failure_0 on ims0: not running (7)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Recover ims:0 (Master ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip (Started ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip-src (Started ims0)
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Jul 04 23:45:02 ims0 crmd: [3935]: info: do_te_invoke: Processing graph 4036 (ref=pe_calc-dc-1372952702-11907) derived from /var/lib/pengine/pe-input-2819.bz2
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 51: stop ims-ip-src_stop_0 on ims0 (local)
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_perform_update: Sent update 4439: fail-count-ims:0=1
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-ims:0 (1372952702)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: cancel_op: operation monitor[3049] on ims-ip-src for client 3935, its parameters: CRM_meta_name=[monitor] cidr_netmask=[24] crm_feature_set=[3.0.6] CRM_meta_timeout=[20000] CRM_meta_interval=[15000] ipaddress=[192.168.141.13] cancelled
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_perform_update: Sent update 4441: last-failure-ims:0=1372952702
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims-ip-src stop[3052] (pid 12111)
> Jul 04 23:45:02 ims0 crmd: [3935]: info: abort_transition_graph: te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair, id=status-ims0-fail-count-ims.0, name=fail-count-ims:0, value=1, magic=NA, cib=0.717.7) : Transient attribute: update
> Jul 04 23:45:02 ims0 crmd: [3935]: info: abort_transition_graph: te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair, id=status-ims0-last-failure-ims.0, name=last-failure-ims:0, value=1372952702, magic=NA, cib=0.717.8) : Transient attribute: update
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip-src_monitor_15000 (call=3049, status=1, cib-update=0, confirmed=true) Cancelled
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: process_pe_message: Transition 4036: PEngine Input stored in: /var/lib/pengine/pe-input-2819.bz2
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation stop[3052] on ims-ip-src for client 3935: pid 12111 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip-src_stop_0 (call=3052, rc=0, cib-update=6231, confirmed=true) ok
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: run_graph: ==== Transition 4036 (Complete=3, Pending=0, Fired=0, Skipped=32, Incomplete=19, Source=/var/lib/pengine/pe-input-2819.bz2): Stopped
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: get_failcount: Failcount for ms-ims on ims0 has expired (limit was 360s)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: unpack_rsc_op: Clearing expired failcount for ims:0 on ims0
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: get_failcount: Failcount for ms-ims on ims0 has expired (limit was 360s)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: unpack_rsc_op: Clearing expired failcount for ims:0 on ims0
> Jul 04 23:45:02 ims0 pengine: [3933]: WARN: unpack_rsc_op: Processing failed op ims:0_last_failure_0 on ims0: not running (7)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: get_failcount: Failcount for ms-ims on ims0 has expired (limit was 360s)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: get_failcount: Failcount for ms-ims on ims0 has expired (limit was 360s)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Recover ims:0 (Master ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Restart ims-ip (Started ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Start ims-ip-src (ims0)
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Jul 04 23:45:02 ims0 crmd: [3935]: info: do_te_invoke: Processing graph 4037 (ref=pe_calc-dc-1372952702-11909) derived from /var/lib/pengine/pe-input-2820.bz2
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_crm_command: Executing crm-event (3): clear_failcount on ims0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 49: stop ims-ip_stop_0 on ims0 (local)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: cancel_op: operation monitor[3047] on ims-ip for client 3935, its parameters: cidr_netmask=[24] nic=[bond1] crm_feature_set=[3.0.6] ip=[192.168.141.13] iflabel=[ims] CRM_meta_name=[monitor] CRM_meta_timeout=[20000] CRM_meta_interval=[15000] cancelled
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims-ip stop[3053] (pid 12154)
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip_monitor_15000 (call=3047, status=1, cib-update=0, confirmed=true) Cancelled
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 72: notify ims:0_pre_notify_demote_0 on ims0 (local)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims:0 notify[3054] (pid 12155)
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 74: notify ims:1_pre_notify_demote_0 on ims1
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation notify[3054] on ims:0 for client 3935: pid 12155 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims:0_notify_0 (call=3054, rc=0, cib-update=0, confirmed=true) ok
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: process_pe_message: Transition 4037: PEngine Input stored in: /var/lib/pengine/pe-input-2820.bz2
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: RA output: (ims-ip:stop:stderr) 2013/07/04_23:45:02 INFO: IP status = ok, IP_CIP=
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation stop[3053] on ims-ip for client 3935: pid 12154 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip_stop_0 (call=3053, rc=0, cib-update=6233, confirmed=true) ok
> Jul 04 23:45:02 ims0 crmd: [3935]: info: handle_failcount_op: Removing failcount for ims:0
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-ims:0 (<null>)
> Jul 04 23:45:02 ims0 cib: [3929]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='ims0']//lrm_resource[@id='ims:0']/lrm_rsc_op[@id='ims:0_last_failure_0'] (origin=local/crmd/6234, version=0.717.11): ok (rc=0)
> Jul 04 23:45:02 ims0 crmd: [3935]: info: abort_transition_graph: te_update_diff:321 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=ims:0_last_failure_0, magic=0:7;7:4024:8:e3f096a7-4eb5-4810-9310-eb144f595e20, cib=0.717.11) : Resource op removal
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_perform_update: Sent delete 4443: node=ims0, attr=fail-count-ims:0, id=<n/a>, set=(null), section=status
> Jul 04 23:45:02 ims0 crmd: [3935]: info: abort_transition_graph: te_update_diff:194 - Triggered transition abort (complete=0, tag=transient_attributes, id=ims0, magic=NA, cib=0.717.12) : Transient attribute: removal
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-ims:0 (<null>)
> Jul 04 23:45:02 ims0 attrd: [3932]: notice: attrd_perform_update: Sent delete 4445: node=ims0, attr=last-failure-ims:0, id=<n/a>, set=(null), section=status
> Jul 04 23:45:02 ims0 crmd: [3935]: info: abort_transition_graph: te_update_diff:194 - Triggered transition abort (complete=0, tag=transient_attributes, id=ims0, magic=NA, cib=0.717.13) : Transient attribute: removal
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: run_graph: ==== Transition 4037 (Complete=7, Pending=0, Fired=0, Skipped=28, Incomplete=19, Source=/var/lib/pengine/pe-input-2820.bz2): Stopped
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Start ims-ip (ims0)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: LogActions: Start ims-ip-src (ims0)
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Jul 04 23:45:02 ims0 crmd: [3935]: info: do_te_invoke: Processing graph 4038 (ref=pe_calc-dc-1372952702-11915) derived from /var/lib/pengine/pe-input-2821.bz2
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 47: start ims-ip_start_0 on ims0 (local)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims-ip start[3055] (pid 12197)
> Jul 04 23:45:02 ims0 pengine: [3933]: notice: process_pe_message: Transition 4038: PEngine Input stored in: /var/lib/pengine/pe-input-2821.bz2
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: RA output: (ims-ip:start:stderr) 2013/07/04_23:45:02 INFO: Adding IPv4 address 192.168.141.13/24 with broadcast address 192.168.141.255 to device bond1 (with label bond1:ims)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: RA output: (ims-ip:start:stderr) 2013/07/04_23:45:02 INFO: Bringing device bond1 up
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: RA output: (ims-ip:start:stderr) 2013/07/04_23:45:02 INFO: /usr/lib64/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.141.13 bond1 192.168.141.13 auto not_used not_used
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation start[3055] on ims-ip for client 3935: pid 12197 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip_start_0 (call=3055, rc=0, cib-update=6236, confirmed=true) ok
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 48: monitor ims-ip_monitor_15000 on ims0 (local)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims-ip monitor[3056] (pid 12255)
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 49: start ims-ip-src_start_0 on ims0 (local)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims-ip-src start[3057] (pid 12256)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation monitor[3056] on ims-ip for client 3935: pid 12255 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip_monitor_15000 (call=3056, rc=0, cib-update=6237, confirmed=false) ok
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation start[3057] on ims-ip-src for client 3935: pid 12256 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip-src_start_0 (call=3057, rc=0, cib-update=6238, confirmed=true) ok
> Jul 04 23:45:02 ims0 crmd: [3935]: info: te_rsc_command: Initiating action 50: monitor ims-ip-src_monitor_15000 on ims0 (local)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: rsc:ims-ip-src monitor[3058] (pid 12336)
> Jul 04 23:45:02 ims0 lrmd: [3931]: info: operation monitor[3058] on ims-ip-src for client 3935: pid 12336 exited with return code 0
> Jul 04 23:45:02 ims0 crmd: [3935]: info: process_lrm_event: LRM operation ims-ip-src_monitor_15000 (call=3058, rc=0, cib-update=6239, confirmed=false) ok
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: run_graph: ==== Transition 4038 (Complete=6, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-2821.bz2): Complete
> Jul 04 23:45:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jul 04 23:46:02 ims0 crmd: [3935]: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (60000ms)
> Jul 04 23:46:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Jul 04 23:46:02 ims0 crmd: [3935]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
> Jul 04 23:46:02 ims0 pengine: [3933]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Jul 04 23:46:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Jul 04 23:46:02 ims0 crmd: [3935]: info: do_te_invoke: Processing graph 4039 (ref=pe_calc-dc-1372952762-11920) derived from /var/lib/pengine/pe-input-2822.bz2
> Jul 04 23:46:02 ims0 crmd: [3935]: notice: run_graph: ==== Transition 4039 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-2822.bz2): Complete
> Jul 04 23:46:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jul 04 23:46:02 ims0 pengine: [3933]: notice: process_pe_message: Transition 4039: PEngine Input stored in: /var/lib/pengine/pe-input-2822.bz2
> Jul 04 23:47:02 ims0 crmd: [3935]: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (60000ms)
> Jul 04 23:47:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Jul 04 23:47:02 ims0 crmd: [3935]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
> Jul 04 23:47:02 ims0 pengine: [3933]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Jul 04 23:47:02 ims0 pengine: [3933]: notice: process_pe_message: Transition 4040: PEngine Input stored in: /var/lib/pengine/pe-input-2822.bz2
> Jul 04 23:47:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Jul 04 23:47:02 ims0 crmd: [3935]: info: do_te_invoke: Processing graph 4040 (ref=pe_calc-dc-1372952822-11921) derived from /var/lib/pengine/pe-input-2822.bz2
> Jul 04 23:47:02 ims0 crmd: [3935]: notice: run_graph: ==== Transition 4040 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-2822.bz2): Complete
> Jul 04 23:47:02 ims0 crmd: [3935]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
>
> corosync.log from ims1:
> -----------------------
> Jul 04 23:45:02 ims1 lrmd: [3913]: info: rsc:ims:1 notify[1424] (pid 25381)
> Jul 04 23:45:02 ims1 lrmd: [3913]: info: operation notify[1424] on ims:1 for client 3917: pid 25381 exited with return code 0
> Jul 04 23:45:02 ims1 crmd: [3917]: info: process_lrm_event: LRM operation ims:1_notify_0 (call=1424, rc=0, cib-update=0, confirmed=true) ok
> Jul 04 23:49:35 ims1 cib: [3911]: info: cib_stats: Processed 324 operations (92.00us average, 0% utilization) in the last 10min
> Jul 04 23:59:35 ims1 cib: [3911]: info: cib_stats: Processed 295 operations (67.00us average, 0% utilization) in the last 10min
> Jul 05 00:00:03 ims1 crmd: [3917]: info: process_lrm_event: LRM operation ims:1_monitor_2000 (call=1423, rc=7, cib-update=778, confirmed=false) not running
> Jul 05 00:00:03 ims1 attrd: [3914]: notice: attrd_ais_dispatch: Update relayed from ims0
> Jul 05 00:00:03 ims1 attrd: [3914]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-ims:1 (1)
> Jul 05 00:00:03 ims1 attrd: [3914]: notice: attrd_perform_update: Sent update 2037: fail-count-ims:1=1
> Jul 05 00:00:03 ims1 attrd: [3914]: notice: attrd_ais_dispatch: Update relayed from ims0
>
> --
> Regards,
> Martin Gazak
> MicroStep-MIS, spol. s r.o.
> System Development Manager
> Tel.: +421 2 602 00 128
> Fax: +421 2 602 00 180
> [email protected]
> http://www.microstep-mis.com
>
> _______________________________________________
> Pacemaker mailing list: [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
