Hi All,

I previously reported a problem in the following Bugzilla entry.

 * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2501

A similar problem occurs with attrd in the latest Pacemaker.

Step 1) Configure the cluster as follows.

export PCMK_fail_fast=no

Step 2) Start the cluster.

Step 3) Cause a failure in a resource so that its fail-count is incremented.

--------------------------------
[root@srv01 ~]# crm_mon -1 -Af
(snip)
Online: [ srv01 ]

 before-dummy (ocf::heartbeat:Dummy): Started srv01
 vip-master (ocf::heartbeat:Dummy2): Started srv01

Migration summary:
* Node srv01:
   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun 9 19:21:07 2014'

Failed actions:
    before-dummy_monitor_10000 on srv01 'not running' (7): call=11, status=complete, last-rc-change='Mon Jun 9 19:21:07 2014', queued=0ms, exec=0ms
--------------------------------

Step 4) Kill attrd so that it is restarted. (This simulates attrd crashing and being respawned.)

Step 5) Cause the same failure in the resource as in step 3 again.

 * The fail-count is reset and shows 1 again, although the failure from step 3 had already been counted.

--------------------------------
[root@srv01 ~]# crm_mon -1 -Af
(snip)
Online: [ srv01 ]

 before-dummy (ocf::heartbeat:Dummy): Started srv01
 vip-master (ocf::heartbeat:Dummy2): Started srv01

Migration summary:
* Node srv01:
   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun 9 19:22:47 2014'

Failed actions:
    before-dummy_monitor_10000 on srv01 'not running' (7): call=17, status=complete, last-rc-change='Mon Jun 9 19:22:47 2014', queued=0ms, exec=0ms
--------------------------------

I think attrd needs to be improved so that attributes are reliably preserved even when attrd is restarted.

Best Regards,
Hideo Yamauch.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
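
P.S. For anyone who wants to compare the fail-count before and after restarting attrd from a script instead of reading the crm_mon output by eye, a minimal sketch follows. It assumes the "Migration summary" layout shown in the output above. The function name get_fail_count is made up for this example and is not part of Pacemaker.

```shell
#!/bin/sh
# get_fail_count (hypothetical helper, not a Pacemaker command):
# reads `crm_mon -1 -Af` output on stdin and prints the fail-count
# reported for the given resource in the "Migration summary" section.
# Prints nothing if the resource has no fail-count entry.
# Assumption: the resource name contains no regex metacharacters.
get_fail_count() {
    resource="$1"
    # Match "<resource>: ... fail-count=N" and keep only N;
    # head keeps the first match in case of several nodes.
    sed -n "s/.*${resource}: .*fail-count=\([0-9][0-9]*\).*/\1/p" | head -n 1
}
```

On a cluster node it could be used as `crm_mon -1 -Af | get_fail_count before-dummy`, once after step 3 and once after step 5. With the behavior reported here, both calls print 1.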