Hi,

While investigating another problem, I found the following behavior.
If the attrd process fails and is not restarted, the problem does not
occur.

Step 1) After startup, cause a monitor error in UmIPaddr twice.

Online: [ srv01 srv02 ]

 Resource Group: UMgroup01
     UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
     UmIPaddr   (ocf::heartbeat:Dummy2):        Started srv01

Migration summary:
* Node srv02: 
* Node srv01: 
   UmIPaddr: migration-threshold=10 fail-count=2

Step 2) Kill attrd; attrd is restarted.

Online: [ srv01 srv02 ]

 Resource Group: UMgroup01
     UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
     UmIPaddr   (ocf::heartbeat:Dummy2):        Started srv01

Migration summary:
* Node srv02: 
* Node srv01: 
   UmIPaddr: migration-threshold=10 fail-count=2

Step 3) Cause one more monitor error in UmIPaddr.

Online: [ srv01 srv02 ]

 Resource Group: UMgroup01
     UmVIPcheck (ocf::heartbeat:Dummy): Started srv01
     UmIPaddr   (ocf::heartbeat:Dummy2):        Started srv01

Migration summary:
* Node srv02: 
* Node srv01: 
   UmIPaddr: migration-threshold=10 fail-count=1 -----> fail-count starts over from 1.

The cause is that attrd loses the fail-count when it restarts (its hash
table is lost).
It is a serious problem that the failure count is reset in this way.
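
The sequence in Steps 1 to 3 can be modeled with a small Python sketch. It only illustrates the suspected mechanism and is not attrd's actual code: `ToyAttrd` and `record_failure` are hypothetical names, and the sketch assumes the count lives only in the daemon's in-memory table.

```python
class ToyAttrd:
    """Hypothetical stand-in for attrd: values live only in process memory."""

    def __init__(self):
        # A freshly started process has an empty table.
        self.table = {}

    def record_failure(self, resource):
        # Increment relative to whatever this process remembers.
        self.table[resource] = self.table.get(resource, 0) + 1
        return self.table[resource]


attrd = ToyAttrd()
attrd.record_failure("UmIPaddr")
count_after_step1 = attrd.record_failure("UmIPaddr")  # Step 1: two failures

attrd = ToyAttrd()                                    # Step 2: killed and restarted

count_after_step3 = attrd.record_failure("UmIPaddr")  # Step 3: one more failure
print(count_after_step1, count_after_step3)
```

This prints `2 1`, matching the Step 1 and Step 3 output above. The display in Step 2 still shows 2, presumably because the copy already written outside attrd is untouched until the next update.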

I can think of the following methods.

Method 1) attrd keeps the fail-count in a file under the "/var/run" directory
and reads it back when it starts.

Method 2) When attrd starts, it communicates with the cib and retrieves the
fail-count.
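
As a rough comparison of the two methods, here is a minimal Python sketch that replays Steps 1 to 3 against each. Everything in it is hypothetical and for illustration only: the class names, the JSON state file, and `ToyCib` are assumptions, not attrd or cib interfaces, and a temporary directory stands in for "/var/run".

```python
import json
import os
import tempfile


class FileBackedAttrd:
    """Method 1 (hypothetical): persist the table to a state file."""

    def __init__(self, path):
        self.path = path
        self.table = {}
        # On startup, reload whatever the previous process saved.
        if os.path.exists(path):
            with open(path) as f:
                self.table = json.load(f)

    def record_failure(self, resource):
        self.table[resource] = self.table.get(resource, 0) + 1
        # Write to a temporary file and rename it, so a crash mid-write
        # does not leave a truncated state file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.table, f)
        os.replace(tmp, self.path)
        return self.table[resource]


class ToyCib:
    """Hypothetical stand-in for the cib; it outlives the attrd process."""

    def __init__(self):
        self.fail_counts = {}


class CibSeededAttrd:
    """Method 2 (hypothetical): seed the table from the cib on startup."""

    def __init__(self, cib):
        self.cib = cib
        # Copy the values the cib already holds instead of starting empty.
        self.table = dict(cib.fail_counts)

    def record_failure(self, resource):
        self.table[resource] = self.table.get(resource, 0) + 1
        self.cib.fail_counts[resource] = self.table[resource]
        return self.table[resource]


def fail_twice_restart_fail_once(start_daemon):
    """Replay Steps 1 to 3; start_daemon() models starting attrd."""
    daemon = start_daemon()
    daemon.record_failure("UmIPaddr")
    daemon.record_failure("UmIPaddr")
    daemon = start_daemon()  # attrd is killed and restarted
    return daemon.record_failure("UmIPaddr")


# A temporary directory stands in for "/var/run" so the sketch runs unprivileged.
state_file = os.path.join(tempfile.mkdtemp(), "fail-counts.json")
file_count = fail_twice_restart_fail_once(lambda: FileBackedAttrd(state_file))

cib = ToyCib()
cib_count = fail_twice_restart_fail_once(lambda: CibSeededAttrd(cib))

print(file_count, cib_count)
```

Both variants print 3 after the restart. In this toy model the difference is where the truth lives: Method 1 adds a second copy on disk that could drift from the cib, while Method 2 reuses the copy the cib already holds but depends on the cib being reachable when attrd starts.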

Is there a better method?

Please consider a solution to this problem.

Best Regards,
Hideo Yamauchi.

