Hi All,

Here is a small contribution to the monitoring scripts. Feel free to distribute, hack, or modify it.


heartbeat.monitor README

For use with the Heartbeat package from www.linux-ha.org, this monitoring script checks the resources (or daemons) under Heartbeat's control and complains, using a single summary status line only, if any are not running. The Heartbeat daemon itself is also monitored, and the script will complain if Heartbeat is not running. The script relies on a standard Heartbeat installation.

Any options sent to this monitoring script are silently ignored. The script returns 0 if Heartbeat and all of the resources (as specified in /etc/ha.d/haresources) are running. It returns 1 if Heartbeat or any one of the resources is not running.
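
To make the return codes concrete, here is a minimal sketch of how a caller could interpret them when testing the monitor by hand. The run_monitor helper is hypothetical (it is not part of Mon or Heartbeat); it only shows the 0 / non-zero convention described above, together with the summary line the monitor prints on failure.

```shell
#!/bin/bash
# Sketch: interpret a monitor's exit status as described in the README.
# run_monitor is a hypothetical helper, not part of Mon or Heartbeat.
run_monitor() {
    local summary status
    # Capture the monitor's summary line and its exit status.
    summary=$("$@" 2>/dev/null)
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "OK"
    else
        echo "FAIL($status): $summary"
    fi
    return "$status"
}
```

For example, run_monitor /usr/lib/mon/mon.d/heartbeat.monitor would print either OK or the WARNING line from the script. That install path is an assumption; use wherever your Mon monitors actually live.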

Note: With a Mon startup script that handles the {start|stop|status} arguments, you could have Heartbeat start Mon, so that a failover starts Mon on another (backup) node if the primary node crashes.
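
A rough sketch of such a {start|stop|status} wrapper follows. Treat it as an illustration only: MON_CMD, PIDFILE, and the mon_ctl and mon_running functions are all assumptions made up for this example, and it assumes the Mon command stays in the foreground when launched this way. Whether Heartbeat keys on the printed status text or on the exit code should be checked against your installed version.

```shell
#!/bin/bash
# Sketch of a {start|stop|status} wrapper so Heartbeat can manage Mon.
# MON_CMD and PIDFILE are assumptions for illustration, not values taken
# from the Mon or Heartbeat packages; adjust both for your installation.
MON_CMD=${MON_CMD:-/usr/sbin/mon}
PIDFILE=${PIDFILE:-/var/run/mon.pid}

# True if the PID recorded in PIDFILE belongs to a live process.
mon_running() {
    [ -s "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

mon_ctl() {
    case "$1" in
        start)
            mon_running && return 0
            # Assumes MON_CMD does not fork into the background itself.
            $MON_CMD >/dev/null 2>&1 &
            echo $! > "$PIDFILE"
            ;;
        stop)
            if mon_running; then
                kill "$(cat "$PIDFILE")"
            fi
            rm -f "$PIDFILE"
            ;;
        status)
            if mon_running; then
                echo "running"
            else
                echo "stopped"
                return 1
            fi
            ;;
        *)
            echo "Usage: $0 {start|stop|status}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when an argument is given, so the file can also be sourced.
if [ $# -gt 0 ]; then
    mon_ctl "$1"
fi
```

The pidfile approach is just one simple way to answer the status question; if your Mon installation already ships an init script with these three arguments, that is the better thing to hand to Heartbeat.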


Here, then, is the heartbeat.monitor script. It's small because all the real brains are already in the Heartbeat package:


--------------------------------------------------


#!/bin/bash
#
#
# heartbeat.monitor
#
# This script (intended for use with the Mon monitoring package and
# the Heartbeat program from www.linux-ha.org) is used to make
# sure the Heartbeat program and all of its resources are still
# running. (This means you have to run Mon on the same machine
# as Heartbeat for this script to work.)
#
# Note: Any arguments sent to this script are silently ignored.
#
#

host=`/bin/uname -n`
resources=`/usr/lib/heartbeat/ResourceManager listkeys $host`
allokay=0
RETVAL=0
downresources=""
HA_DIR=/etc/ha.d; export HA_DIR
CONFIG=$HA_DIR/ha.cf
. $HA_DIR/shellfuncs

#
# Is the Heartbeat daemon running?
#
$HA_BIN/heartbeat -s > /dev/null
if [ $? -ne 0 ]; then
    allokay=1
    downresources="The Heartbeat Daemon"
fi

#
# Are the resources for this host running?
#
for resource in $resources
do
    # Is this resource not running?
    /usr/lib/heartbeat/ResourceManager status $resource > /dev/null
    RETVAL=$?

    if [ $RETVAL -ne 0 ]; then
        allokay=1
        # Append to the list, adding a separating space only if needed.
        downresources="${downresources:+$downresources }Resource=$resource"
    fi
done

#
# Report any resources that are dead/down.
#
if [ $allokay -ne 0 ]; then
    now=`date`
    echo "WARNING: $downresources not running on $host at $now"
    exit 1
else
    exit 0
fi


--------------------------------------------------

--K





_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon
