Hi,
I'm new to this list, so if this has been asked in the past please excuse me and point me 
in the right direction :)

I have just started using mon, and the initial tests seem to be really good. 
I'm now looking at using it to monitor all my servers (approx. 20).

I have a problem at the moment :(
I have tried adapting alert.template to create a new alert, but it won't run :(
I use webmin to set up mon, and it won't even see the new script :(

I then edited the mon.cf file by hand, and got nowhere either :(

When the alert is triggered, I get nothing showing that the new program was called :(

My mon.cf is as follows:
cat /etc/mon/mon.cf
#
# Extremely basic mon.cf file
#
#
# global options
#
cfbasedir   = /etc/mon
pidfile     = /var/run/mon.pid
statedir    = /var/run/mon/state.d
logdir      = /var/run/mon/log.d
dtlogfile   = /var/run/mon/log.d/downtime.log
alertdir    = /usr/lib/mon/alert.d
mondir      = /usr/lib/mon/mon.d
maxprocs    = 20
histlength  = 100
randstart   = 60s
authtype    = userfile
userfile    = /etc/mon/userfile

#
# group definitions (hostnames or IP addresses)
#
hostgroup servers localhost

watch servers
    service ping
        interval 5m
        monitor fping.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alert mail.alert [EMAIL PROTECTED]
            alertevery 1h
        period wd {Sat-Sun}
            alert mail.alert [EMAIL PROTECTED]
    service http
        interval 4m
        monitor http.monitor
        allow_empty_group
        period wd {Sun-Sat}
            upalert mail.alert -S "web server is back up" [EMAIL PROTECTED]
            alertevery 45m
    service smtp
        interval 10m
        monitor smtp.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alertevery 1h
            alertafter 2 30m
            alert qpage.alert [EMAIL PROTECTED]
    service pop3
        interval 30s
        monitor pop3.monitor
        period
            alert alert.reboot

# See /usr/doc for the original example...

The script I have written is in:
pwd
/usr/lib/mon/alert.d
-rwxr-xr-x    1 root     root         1911 Jul 22 23:32 alert.reboot

The script is as follows:
#!/usr/bin/perl
#
# Reboot alert system
# Matt Lowe [EMAIL PROTECTED]
# Created from template by
#
# Jim Trocki, [EMAIL PROTECTED]
#
# $Id: alert.template 1.1 Sat, 26 Aug 2000 15:22:34 -0400 trockij $
#
#    Copyright (C) 1998, Jim Trocki
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation; either version 2 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program; if not, write to the Free Software
#    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#
use Getopt::Std;
getopts ("s:g:h:t:l:u");

#entered by [EMAIL PROTECTED]
# append to the log, and fail loudly if it cannot be opened
open LOG_FILE, ">>/var/log/mon" or die "cannot open /var/log/mon: $!";
select LOG_FILE;
#
# the first line is summary information, adequate to send to a pager
# or email subject line
#
#
# the following lines normally contain more detailed information,
# but this is monitor-dependent
#
# see the "Alert Programs" section in mon(1) for an explanation
# of the options that are passed to the monitor script.
#
$summary=<STDIN>;
chomp $summary;

$t = localtime($opt_t);
($wday,$mon,$day,$tm) = split (/\s+/, $t);

print <<EOF;

Alert for group $opt_g, service $opt_s
EOF

print "This alert was sent because service was restored\n"
    if ($opt_u);

print <<EOF;
This happened on $wday $mon $day $tm
Summary information: $summary
Arguments passed to this script: @ARGV
Detailed information follows:

EOF

while (<STDIN>) {
    print;
}
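# MON_ALERTTYPE is read from the environment; strings are compared
# with eq, because == compares numerically and would always match here
if ($ENV{MON_ALERTTYPE} eq 'failure') {
    system("/sbin/shutdown -r 1");
}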

(Sorry for the large post.)
The program is meant to write its standard output to a log file and reboot the system.
I know this reboot might seem extreme, but the server it is being designed for is very 
unstable (it only has to last another 2 weeks or so before replacement).
The routine would also be helpful on a couple of servers I have, as a 'worst case' 
problem solver :)

Also, I'd like to ask if anyone has written an alert that shuts down a service and 
starts it back up again?
Something like:
Alert type: service_restart
Extra Params : "service_shutdown", "service_startup", pause

which would execute something like:
system ('service',$service_shutdown);
sleep $pause;
system ('service',$service_startup);

This could then very easily be made to handle restarting any number of services on a 
Linux box, without having to write an individual alert for each service.
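
To make the idea a bit more concrete, here is a rough, untested sketch of such an alert. The file name service_restart.alert, the sub names, the 5 second default pause and the argument layout are all my own assumptions. It also assumes that mon hands any extra words on the "alert" line to the script in @ARGV after its own options (as the template's "Arguments passed to this script" line suggests), and that a 'service' command is on the PATH.

```perl
#!/usr/bin/perl
# service_restart.alert: hypothetical sketch, not a tested mon alert.
# Assumed extra parameters: "<svc> stop" "<svc> start" [pause seconds]
use strict;
use warnings;
use Getopt::Std;

# Turn the extra parameters into two argument lists for system().
sub restart_plan {
    my ($stop, $start, $pause) = @_;
    # fall back to an assumed default when no numeric pause is given
    $pause = 5 unless defined $pause && $pause =~ /^\d+$/;
    return {
        stop  => [ 'service', split(' ', $stop) ],
        start => [ 'service', split(' ', $start) ],
        pause => $pause,
    };
}

# Stop the service, wait, then start it again; warn if a step fails.
sub run_plan {
    my ($plan) = @_;
    system(@{ $plan->{stop} }) == 0
        or warn "stop command failed: @{ $plan->{stop} }\n";
    sleep $plan->{pause};
    system(@{ $plan->{start} }) == 0
        or warn "start command failed: @{ $plan->{start} }\n";
}

sub main {
    my %opt;
    getopts('s:g:h:t:l:u', \%opt);   # same option string as the template
    return if $opt{u};               # nothing to restart on an upalert
    die "usage: $0 [mon options] 'svc stop' 'svc start' [pause]\n"
        if @ARGV < 2;
    run_plan(restart_plan(@ARGV));
}

# The template expects mon to pass options, so an empty @ARGV is
# treated here as "loaded for testing" and nothing is run.
main() if @ARGV;
```

In mon.cf I imagine this would be used as something like: alert service_restart.alert "httpd stop" "httpd start" 10 (again, only a guess at how the quoting would be handled).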

Thanks for any help,

Matt Lowe
mon <at> mlsis.org
_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon