On Mon, 2019-01-28 at 18:04 +0530, Dileep V Nair wrote:
> Hi,
>
> I am seeing a log entry showing that the Recheck Timer popped, and the
> time in pacemaker.log went back in time. After some time, the time
> issue corrected itself. Around the same time the resources also failed
> over (the slave became master). Does anyone know why this happens?
>
> Jan 23 01:16:48 [9383] pn4ushleccp1 lrmd: notice: operation_finished: db_cp1_monitor_20000:32476:stderr [ /usr/bin/.: Permission denied. ]
> Jan 23 01:16:48 [9383] pn4ushleccp1 lrmd: notice: operation_finished: db_cp1_monitor_20000:32476:stderr [ /usr/bin/.: Permission denied. ]
> Jan 22 20:17:03 [9386] pn4ushleccp1 crmd: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (900000ms)
Pacemaker can handle the clock jumping forward, but not backward. The
recheck timer here is unrelated to the clock jump; it is just the first
log message to appear since the jump. You definitely want to find out
what is changing the clock.

If this happens at system boot, the hardware clock is likely wrong and
some time manager (ntp, etc.) is adjusting it. Pacemaker's systemd unit
file has "After=time-sync.target" to try to ensure that it doesn't start
until after this has happened, but unfortunately you often have to take
extra steps to make time managers use that target (e.g. enable
chronyd-wait.service if you're using chronyd), and of course if you're
not using systemd it's no help. But the basic idea is that you want to
ensure pacemaker starts only after the time has been adjusted at boot.

If this isn't at boot, then something odd is going on with your host.
Check the system log around the time of the jump, etc.

> Jan 22 20:17:03 [9386] pn4ushleccp1 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped
> Jan 22 20:17:03 [9386] pn4ushleccp1 crmd: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: process_pe_message: Input has not changed since last time, not saving to disk
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: notice: unpack_config: Relying on watchdog integration for fencing
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_online_status_fencing: Node pn4us7leccp1 is active
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_online_status: Node pn4us7leccp1 is online
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_online_status_fencing: Node pn4ushleccp1 is active
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_online_status: Node pn4ushleccp1 is online
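To illustrate why the backward direction is the problematic one, here is a
minimal sketch (not Pacemaker's actual code; the function names are made up
for illustration) of a timer armed against the wall clock. If the clock is
stepped back five hours, as in the log above, a 900-second deadline measured
on the wall clock simply stops popping for those five hours, whereas a
deadline measured on a monotonic clock is unaffected by clock steps:

```python
import time

def wall_deadline(armed_at, interval_s):
    """Deadline measured on the wall clock (vulnerable to clock steps)."""
    return armed_at + interval_s

def popped(deadline, now):
    """A timer 'pops' once the observed time reaches its deadline."""
    return now >= deadline

# Timer armed at wall time t=1000 with a 900s interval (the recheck
# timer's 900000ms), then the clock is stepped back 5 hours.
deadline = wall_deadline(1000.0, 900)
now_after_step = 1000.0 - 5 * 3600 + 900  # 900s of real time elapsed

print(popped(deadline, now_after_step))   # prints False: timer appears stuck

# The same timer measured with time.monotonic() would be immune, since
# a monotonic clock never moves backward regardless of NTP or admin steps.
t0 = time.monotonic()
assert time.monotonic() >= t0
```

This is why time daemons correct large offsets by stepping the clock once at
boot (before pacemaker starts) and thereafter only slew it gradually.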
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource db_cp1:0 active on pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource TSM_DB2 active on pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource TSM_DB2 active on pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource ip_cp1 active on pn4ushleccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource db_cp1:1 active in master mode on pn4ushleccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource TSM_DB2log active on pn4ushleccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: determine_op_status: Operation monitor found resource KUD_DB2 active on pn4ushleccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: native_print: stonith-sbd (stonith:external/sbd): Started pn4ushleccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: native_print: ip_cp1 (ocf::heartbeat:IPaddr2): Started pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: clone_print: Master/Slave Set: ms_db2_cp1 [db_cp1]
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: short_print: Masters: [ pn4us7leccp1 ]
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: short_print: Slaves: [ pn4ushleccp1 ]
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: native_print: TSM_DB2 (systemd:dsmcad_db2): Started pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: native_print: TSM_DB2log (systemd:dsmcad_db2log): Started pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: native_print: KUD_DB2 (systemd:kuddb2_db2): Started pn4us7leccp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: rsc_merge_weights: ms_db2_cp1: Breaking dependency loop at ms_db2_cp1
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: master_color: Promoting db_cp1:0 (Master pn4us7leccp1)
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: master_color: ms_db2_cp1: Promoted 1 instances of a possible 1 to master
> Jan 22 20:17:03 [9385] pn4ushleccp1 pengine: info: LogActions: Leave ip_cp1 (Started pn4us7leccp1)
>
> After the transition, the date was shifted back to normal:
>
> Jan 22 20:47:03 [9386] pn4ushleccp1 crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
> Jan 22 20:47:03 [9386] pn4ushleccp1 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
> Jan 23 01:47:22 [9383] pn4ushleccp1 lrmd: notice: operation_finished: db_cp1_monitor_20000:19518:stderr [ /usr/bin/.: Permission denied. ]
> Jan 23 01:47:22 [9383] pn4ushleccp1 lrmd: notice: operation_finished: db_cp1_monitor_20000:19518:stderr [ /usr/bin/.: Permission denied. ]
>
> Thanks & Regards
>
> Dileep Nair
> Squad Lead - SAP Base
> Togaf Certified Enterprise Architect
> IBM Services for Managed Applications
> +91 98450 22258 Mobile
> dilen...@in.ibm.com
>
> IBM Services

-- 
Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org