Hi, I'm new to this list, so if this has been asked in the past, please excuse me and point me in the right direction :)
I have just started using mon, and the initial tests look really good. I'm now looking at using it to monitor all my servers (approx. 20).

I have a problem at the moment, though :( I tried adapting alert.template to create a new alert, but it won't run. I use Webmin to set up mon, and it doesn't even see the new script. I then edited the mon.cf file by hand and got nowhere either: when the alert is triggered, I get nothing showing that the new program was called.

My mon.cf is as follows:

cat /etc/mon/mon.cf

#
# Extremely basic mon.cf file
#

#
# global options
#
cfbasedir   = /etc/mon
pidfile     = /var/run/mon.pid
statedir    = /var/run/mon/state.d
logdir      = /var/run/mon/log.d
dtlogfile   = /var/run/mon/log.d/downtime.log
alertdir    = /usr/lib/mon/alert.d
mondir      = /usr/lib/mon/mon.d
maxprocs    = 20
histlength  = 100
randstart   = 60s
authtype    = userfile
userfile    = /etc/mon/userfile

#
# group definitions (hostnames or IP addresses)
#
hostgroup servers localhost

watch servers
    service ping
        interval 5m
        monitor fping.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alert mail.alert [EMAIL PROTECTED]
            alertevery 1h
        period wd {Sat-Sun}
            alert mail.alert [EMAIL PROTECTED]
    service http
        interval 4m
        monitor http.monitor
        allow_empty_group
        period wd {Sun-Sat}
            upalert mail.alert -S "web server is back up" [EMAIL PROTECTED]
            alertevery 45m
    service smtp
        interval 10m
        monitor smtp.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alertevery 1h
            alertafter 2 30m
            alert qpage.alert [EMAIL PROTECTED]
    service pop3
        interval 30s
        monitor pop3.monitor
        period
            alert alert.reboot

# See /usr/doc for the original example...
The script I have written is in /usr/lib/mon/alert.d:

pwd
/usr/lib/mon/alert.d
-rwxr-xr-x 1 root root 1911 Jul 22 23:32 alert.reboot

and the script is as follows:

#!/usr/bin/perl
#
# Reboot alert system
# Matt Lowe [EMAIL PROTECTED]
# Created from template by
#
# Jim Trocki, [EMAIL PROTECTED]
#
# $Id: alert.template 1.1 Sat, 26 Aug 2000 15:22:34 -0400 trockij $
#
# Copyright (C) 1998, Jim Trocki
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
use Getopt::Std;
getopts ("s:g:h:t:l:u");

#entered by [EMAIL PROTECTED]
open LOG_FILE, ">>/var/log/mon";
select LOG_FILE;

#
# the first line is summary information, adequate to send to a pager
# or email subject line
#
#
# the following lines normally contain more detailed information,
# but this is monitor-dependent
#
# see the "Alert Programs" section in mon(1) for an explanation
# of the options that are passed to the monitor script.
#
$summary=<STDIN>;
chomp $summary;

$t = localtime($opt_t);
($wday,$mon,$day,$tm) = split (/\s+/, $t);

print <<EOF;

Alert for group $opt_g, service $opt_s
EOF

print "This alert was sent because service was restored\n"
    if ($opt_u);

print <<EOF;
This happened on $wday $mon $day $tm
Summary information: $summary
Arguments passed to this script: @ARGV
Detailed information follows:

EOF

while (<STDIN>) {
    print;
}

if ($MON_ALERTTYPE == 'failure') {
    system("/sbin/shutdown -r 1");
}

(Sorry for the large post.)

The program is meant to write its standard output to a log file and then reboot the system. I know this reboot might seem extreme, but the server it is being designed for is very unstable (it only has to last another 2 weeks or so before replacement). The routine would also be helpful on a couple of other servers I have, as a 'worst case' problem solver :)

Also, I'd like to ask whether anyone has written an alert that shuts down a service and starts it back up again? Something like:

Alert type: service_restart
Extra params: "service_shutdown", "service_startup", pause

which would execute something like:

system ('service', $service_shutdown);
sleep $pause;
system ('service', $service_startup);

This could then very easily be made to handle restarting any number of services on a Linux box, without having to write an individual alert for each service.

Thanks for any help,

Matt Lowe
mon <at> mlsis.org
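The service_restart idea above could be sketched roughly as follows. This is a minimal, untested sketch rather than a working mon alert. It assumes the extra parameters arrive as plain command-line arguments after mon's own options (in the order shutdown, startup, pause), that mon exports the alert type in the MON_ALERTTYPE environment variable as described in its man page, and that the `service` wrapper exists on the box. The names service_restart.alert, restart_commands and is_failure are hypothetical. Each parameter is split on whitespace so that "httpd stop" reaches `service` as two arguments, and the alert type is compared with `eq`, since `==` would compare the strings as numbers.

```perl
#!/usr/bin/perl
# Hypothetical service_restart.alert: a sketch, not a tested mon alert.
use strict;
use warnings;
use Getopt::Std;

# Build the two argument lists. Each parameter is split on whitespace so
# that "httpd stop" becomes ('service', 'httpd', 'stop'), not one argument.
sub restart_commands {
    my ($shutdown, $startup) = @_;
    return ([ 'service', split(' ', $shutdown) ],
            [ 'service', split(' ', $startup) ]);
}

# String comparison needs eq; == would treat both sides as numbers.
sub is_failure {
    my ($type) = @_;
    return defined $type && $type eq 'failure';
}

sub main {
    # Options mon passes to alerts, same spec as in alert.template.
    my %opt;
    getopts('s:g:h:t:l:u', \%opt);

    # Assumed extra parameters taken from the alert line in mon.cf.
    my ($shutdown, $startup, $pause) = @ARGV;
    die "usage: $0 [mon options] shutdown startup [pause]\n"
        unless defined $startup;
    $pause = 5 unless defined $pause;    # arbitrary default for the sketch

    # Assumption: mon sets MON_ALERTTYPE in the alert's environment.
    return unless is_failure($ENV{MON_ALERTTYPE});

    my ($stop, $start) = restart_commands($shutdown, $startup);
    system(@$stop) == 0 or warn "stop failed: @$stop\n";
    sleep $pause;
    system(@$start) == 0 or warn "start failed: @$start\n";
}

# Run only when called with arguments, so the helpers can be loaded alone.
main() if @ARGV;
```

Under those assumptions, a mon.cf line such as `alert service_restart.alert "httpd stop" "httpd start" 10` would stop httpd, wait 10 seconds and start it again; the service name is only an example.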
_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon