Hey guys,
This may be dumb or obvious, but it took me a long time to understand why my
pingd with dampen never got updated! So I think it may be useful to share
this with everybody.
In short:
"You MUST define a monitor interval HIGHER than the dampen delay."
An example is better than a long speech:
*WILL WORK*
primitive pingd ocf:pacemaker:ping \
    params host_list="10.1.1.1" attempts="5" timeout="2" debug="true" dampen="50s" \
    op monitor interval="60" timeout="60" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90"
Because Monitor interval > dampen (60 > 50)
*WILL NOT WORK*
primitive pingd ocf:pacemaker:ping \
    params host_list="10.1.1.1" attempts="5" timeout="2" debug="true" dampen="50s" \
    op monitor interval="10" timeout="60" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90"
Because Monitor interval < dampen (10 < 50)
*Explanations*
Why? Because every 10 seconds (the monitor interval) pacemaker triggers
attrd_updater (check your logfiles for [1]). attrd_updater then waits for the
dampen time given as an argument (-d 50s in our example). However, it never
gets to wait that long, because 10 seconds later attrd_updater is called
again, which resets the dampen timer, again and again. The dampen delay
therefore never expires, and consequently pacemaker will NEVER update pingd
(unless you force a reset with attrd_updater -R or modify the CIB). Q.E.D.
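
To make the timing argument concrete, here is a small toy model in plain sh.
It is only a sketch of the behaviour described above, not attrd's real code:
the function name simulate_dampen and its arguments are made up for this
illustration, and all times are whole seconds. Each monitor run restarts a
countdown of "dampen" seconds, and a write to the CIB only counts when the
countdown expires before the next monitor run restarts it.

```shell
#!/bin/sh
# Toy model (NOT attrd's implementation): count how many times the value
# would be written to the CIB during DURATION seconds.
# Usage: simulate_dampen INTERVAL DAMPEN DURATION   (whole seconds)
simulate_dampen() {
    interval=$1 dampen=$2 duration=$3
    deadline=-1   # -1 means no countdown is pending
    flushes=0
    t=0
    while [ "$t" -le "$duration" ]; do
        # A monitor run calls attrd_updater, which (re)starts the countdown.
        if [ $((t % interval)) -eq 0 ]; then
            deadline=$((t + dampen))
        fi
        # The countdown expired without being reset: the value is written.
        if [ "$deadline" -ge 0 ] && [ "$t" -ge "$deadline" ]; then
            flushes=$((flushes + 1))
            deadline=-1
        fi
        t=$((t + 1))
    done
    echo "$flushes"
}

simulate_dampen 60 50 300   # monitor interval > dampen: prints 5
simulate_dampen 10 50 300   # monitor interval < dampen: prints 0
```

With interval=60 the countdown always expires 10 seconds before the next
monitor run, so every run ends in a write. With interval=10 the countdown is
restarted five times before it could ever expire.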
Something somewhere should check that dampen is smaller than the monitor
interval. It could be checked in the OCF agent, in ping_validate(). But it
doesn't seem to be used (note that there is code for an interval parameter
that doesn't exist in ocf:pacemaker:ping anyway, but comes from the old
ocf:pacemaker:pingd).
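
As a rough idea of what such a check could look like, here is a standalone
sketch in plain sh. The helper names to_ms and dampen_ok are hypothetical,
and only the suffixes "ms", "s" and "m" (or a bare number taken as seconds)
are handled. It assumes the monitor interval reaches the agent in
milliseconds (as far as I know that is how OCF_RESKEY_CRM_meta_interval is
passed), so both values are converted to milliseconds before comparing. Note
that the comparison has to be numeric: inside [ ], a bare ">" is treated by
the shell as a redirection, not as "greater than".

```shell
#!/bin/sh
# to_ms VALUE: convert "50", "50s", "500ms" or "2m" to milliseconds.
# Hypothetical helper; other time formats are not handled in this sketch.
to_ms() {
    case "$1" in
        *ms) echo "${1%ms}" ;;
        *s)  echo $(( ${1%s} * 1000 )) ;;
        *m)  echo $(( ${1%m} * 60000 )) ;;
        *)   echo $(( $1 * 1000 )) ;;   # bare number: assumed to be seconds
    esac
}

# dampen_ok DAMPEN INTERVAL_MS: succeed if dampen is shorter than the
# monitor interval. An interval of 0 (start, stop, probe) is accepted,
# because there is no recurring monitor to compare against.
dampen_ok() {
    dampen_ms=$(to_ms "$1")
    interval_ms=${2:-0}
    [ "$interval_ms" -eq 0 ] || [ "$dampen_ms" -lt "$interval_ms" ]
}

dampen_ok 50s 60000 && echo "60s monitor: ok"
dampen_ok 50s 10000 || echo "10s monitor: dampen too long"
```

Inside the agent, the second argument would presumably be
"$OCF_RESKEY_CRM_meta_interval" and the first "$OCF_RESKEY_dampen".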
Any objections/comments to this deductive reasoning ?
[1] : attrd_updater: [9712]: info: Invoked: attrd_updater -n pingd -v 0 -d 50s
-Thomas
--- a/ping 2010-12-21 16:03:48.000000000 +1000
+++ b/ping 2010-12-21 16:46:45.000000000 +1000
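@@ -200,6 +200,17 @@
exit $OCF_ERR_CONFIGURED
fi
+# Check the dampen interval (must be smaller than the monitor interval).
+# Inside [ ], ">" is a redirection, so compare integers with -ge instead.
+# Assumes dampen is in seconds ("50" or "50s") and CRM_meta_interval in ms.
+ dampen_s=${OCF_RESKEY_dampen%s}
+ if [ "${OCF_RESKEY_CRM_meta_interval:-0}" -gt 0 ] &&
+ [ $(( ${dampen_s:-0} * 1000 )) -ge "$OCF_RESKEY_CRM_meta_interval" ]; then
+ ocf_log err "Invalid dampen value. dampen should be smaller than the monitor interval!"
+ exit $OCF_ERR_CONFIGURED
+ fi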
+
+
# Check the host list
if [ "x" = "x$OCF_RESKEY_host_list" ]; then
ocf_log err "Empty host_list. Please specify some nodes to ping"
_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker