Hi,

While investigating another problem, I discovered the following phenomenon. If attrd fails but is not restarted, the problem does not occur.

Step 1) After startup, cause a monitor error in UmIPaddr twice.

    Online: [ srv01 srv02 ]

     Resource Group: UMgroup01
         UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
         UmIPaddr   (ocf::heartbeat:Dummy2): Started srv01

    Migration summary:
    * Node srv02:
    * Node srv01:
       UmIPaddr: migration-threshold=10 fail-count=2

Step 2) Kill attrd; attrd is restarted.

    Online: [ srv01 srv02 ]

     Resource Group: UMgroup01
         UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
         UmIPaddr   (ocf::heartbeat:Dummy2): Started srv01

    Migration summary:
    * Node srv02:
    * Node srv01:
       UmIPaddr: migration-threshold=10 fail-count=2

Step 3) Cause one more monitor error in UmIPaddr.

    Online: [ srv01 srv02 ]

     Resource Group: UMgroup01
         UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
         UmIPaddr   (ocf::heartbeat:Dummy2): Started srv01

    Migration summary:
    * Node srv02:
    * Node srv01:
       UmIPaddr: migration-threshold=10 fail-count=1   -----> fail-count goes back to 1.

The problem is that attrd loses the fail-count when it restarts (its hash tables are lost).
It is a serious problem that the failure count is reset.

I can think of the following methods.

 Method 1) attrd keeps the fail-count in a file under the "/var/run" directory and refers to it.
 Method 2) When attrd starts, it communicates with the cib and receives the fail-count.

Is there a better method?

Please think about a solution to this problem.

Best Regards,
Hideo Yamauchi.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
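
To make method 1 more concrete, here is a minimal sketch in Python of a counter that is written to a file on every update and read back at startup, so that a restarted process continues from the previous value instead of starting again at 1. This is only an illustration of the idea, not attrd's actual code: the class name FailCountStore, the JSON file format, and the file name are all assumptions made for this example.

```python
import json
import os
import tempfile


class FailCountStore:
    """Hypothetical helper: fail-counts that survive a process restart."""

    def __init__(self, path):
        self.path = path
        # Reload whatever a previous process instance saved.
        self.counts = self._load()

    def _load(self):
        # A missing file means a first start: no failures recorded yet.
        try:
            with open(self.path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _save(self):
        # Write to a temporary file and rename it over the old one, so a
        # crash in the middle of a write cannot leave a half-written file.
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.counts, f)
        os.replace(tmp, self.path)

    def increment(self, resource):
        # Bump the counter and persist it before returning.
        self.counts[resource] = self.counts.get(resource, 0) + 1
        self._save()
        return self.counts[resource]


if __name__ == "__main__":
    # Assumed file name; a real daemon would use a path under /var/run.
    state_file = os.path.join(tempfile.mkdtemp(), "fail-counts.json")

    store = FailCountStore(state_file)
    store.increment("UmIPaddr")
    store.increment("UmIPaddr")       # two monitor errors, as in step 1

    store = FailCountStore(state_file)  # simulates the restart in step 2
    print(store.increment("UmIPaddr"))  # prints 3, not 1
```

Note that "/var/run" is typically cleared when the machine boots, so a file kept there would survive a restart of the daemon but not a reboot of the node.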