Thanks Ralf,
Mon now calls the alert.reboot program :)
All I changed was the period line.
The problem I have now is working out what the status of the system is.
The way I understood the man pages, there should be a variable available to the alert
program, e.g. $MON_ALERTTYPE, which is set to 'failure' or 'up' to signify what the
alert is about.
I have:
if ($MON_ALERTTYPE == 'failure') {
    system("/sbin/shutdown -r 1");
}

I also have this in the alert script:
print <<EOF;
This happened on $wday $mon $day $tm
Summary information: $summary
Arguments passed to this script: @ARGV
Alert Type: $MON_ALERTTYPE
Detailed information follows:

EOF

This shows NOTHING for the contents of $MON_ALERTTYPE.
Am I using the wrong variable, or am I missing some other setting?
(Full script at end of post.)
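
One possible explanation, offered as a sketch rather than a confirmed answer: if mon hands MON_ALERTTYPE to alert programs through the environment (which is how the man page appears to describe it, and worth checking against the installed version), then a plain Perl variable $MON_ALERTTYPE is never set and prints as empty. The value would be read from %ENV instead. The comparison also needs "eq", because "==" compares numerically and treats any two non-numeric strings as equal. The helper names alert_type and should_reboot below are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: mon exports the alert type to alert programs as the
# environment variable MON_ALERTTYPE, not as a Perl variable.
# alert_type() is a hypothetical helper, not part of mon.
sub alert_type {
    my ($env) = @_;    # hash ref, normally \%ENV
    return defined $env->{MON_ALERTTYPE} ? $env->{MON_ALERTTYPE} : '';
}

# Returns 1 only for failure alerts, 0 otherwise.
# "eq" compares strings; "==" would compare both sides as numbers,
# so 'up' == 'failure' would be true (0 == 0).
sub should_reboot {
    my ($env) = @_;
    return alert_type($env) eq 'failure' ? 1 : 0;
}

# In the alert script itself this might be used as:
#   print "Alert Type: ", alert_type(\%ENV), "\n";
#   system("/sbin/shutdown -r 1") if should_reboot(\%ENV);
```

Passing the hash ref in, rather than reading %ENV directly inside the helpers, keeps the check easy to try out by hand without triggering a real alert.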

Thanks
Matt Lowe
mon <at> mlsis.org


"Ralf Roeber" <[EMAIL PROTECTED]> wrote ..
> Maybe it helps to set a period
> 
> >     service pop3
> >         interval 30s
> >         monitor pop3.monitor
> >         period
> 
> should read like this ?
>          period wd {Mon-Sun} hr {0am-12pm}
> 
> >             alert alert.reboot
> 
> Without an error msg it's hard to help. Have you had a look
> in the logfiles?
> 
> 
> 
> ----- Original Message -----
> From: "Matt Lowe" <[EMAIL PROTECTED]>
> To: "Mon Mailing List" <[EMAIL PROTECTED]>
> Sent: Wednesday, July 23, 2003 1:02 AM
> Subject: newbie question
> 
> 
> > Hi,
> > I'm new to this list, so if this has been asked in the past please excuse
> > me and point me in the right direction :)
> >
> > I have just started using mon, and the initial tests seem to be really
> > good. I'm now looking at using it to monitor all my servers (approx. 20).
> >
> > I have a problem at the moment :(
> > I have tried adapting the alert.template to create a new alert, but it
> > won't run :(
> > I use webmin to set up mon, and it won't even see the new script :(
> >
> > I then edited the mon.cf file by hand, and got nowhere either :(
> >
> > When the alert is triggered, I get nothing showing that the new program
> > was called :(
> >
> > My mon.cf is as follows:
> > cat /etc/mon/mon.cf
> > #
> > # Extremely basic mon.cf file
> > #
> > #
> > # global options
> > #
> > cfbasedir   = /etc/mon
> > pidfile     = /var/run/mon.pid
> > statedir    = /var/run/mon/state.d
> > logdir      = /var/run/mon/log.d
> > dtlogfile   = /var/run/mon/log.d/downtime.log
> > alertdir    = /usr/lib/mon/alert.d
> > mondir      = /usr/lib/mon/mon.d
> > maxprocs    = 20
> > histlength  = 100
> > randstart   = 60s
> > authtype    = userfile
> > userfile    = /etc/mon/userfile
> >
> > #
> > # group definitions (hostnames or IP addresses)
> > #
> > hostgroup servers localhost
> >
> > watch servers
> >     service ping
> >         interval 5m
> >         monitor fping.monitor
> >         period wd {Mon-Fri} hr {7am-10pm}
> >             alert mail.alert [EMAIL PROTECTED]
> >             alertevery 1h
> >         period wd {Sat-Sun}
> >             alert mail.alert [EMAIL PROTECTED]
> >     service http
> >         interval 4m
> >         monitor http.monitor
> >         allow_empty_group
> >         period wd {Sun-Sat}
> >             upalert mail.alert -S "web server is back up" [EMAIL PROTECTED]
> >             alertevery 45m
> >     service smtp
> >         interval 10m
> >         monitor smtp.monitor
> >         period wd {Mon-Fri} hr {7am-10pm}
> >             alertevery 1h
> >             alertafter 2 30m
> >             alert qpage.alert [EMAIL PROTECTED]
> >     service pop3
> >         interval 30s
> >         monitor pop3.monitor
> >         period
> >             alert alert.reboot
> >
> > # See /usr/doc for the original example...
> >
> > The script I have written is in:
> > pwd
> > /usr/lib/mon/alert.d
> > -rwxr-xr-x    1 root     root         1911 Jul 22 23:32 alert.reboot
> >
> > and the script is as follows:
> > #!/usr/bin/perl
> > #
> > # Reboot alert system
> > # Matt Lowe [EMAIL PROTECTED]
> > # Created from template by
> > #
> > # Jim Trocki, [EMAIL PROTECTED]
> > #
> > # $Id: alert.template 1.1 Sat, 26 Aug 2000 15:22:34 -0400 trockij $
> > #
> > #    Copyright (C) 1998, Jim Trocki
> > #
> > #    This program is free software; you can redistribute it and/or modify
> > #    it under the terms of the GNU General Public License as published by
> > #    the Free Software Foundation; either version 2 of the License, or
> > #    (at your option) any later version.
> > #
> > #    This program is distributed in the hope that it will be useful,
> > #    but WITHOUT ANY WARRANTY; without even the implied warranty of
> > #    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > #    GNU General Public License for more details.
> > #
> > #    You should have received a copy of the GNU General Public License
> > #    along with this program; if not, write to the Free Software
> > #    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
> > #
> > use Getopt::Std;
> > getopts ("s:g:h:t:l:u");
> >
> > #entered by [EMAIL PROTECTED]
> > open LOG_FILE, ">>/var/log/mon";
> > select LOG_FILE;
> > #
> > # the first line is summary information, adequate to send to a pager
> > # or email subject line
> > #
> > #
> > # the following lines normally contain more detailed information,
> > # but this is monitor-dependent
> > #
> > # see the "Alert Programs" section in mon(1) for an explanation
> > # of the options that are passed to the monitor script.
> > #
> > $summary=<STDIN>;
> > chomp $summary;
> >
> > $t = localtime($opt_t);
> > ($wday,$mon,$day,$tm) = split (/\s+/, $t);
> >
> > print <<EOF;
> >
> > Alert for group $opt_g, service $opt_s
> > EOF
> >
> > print "This alert was sent because service was restored\n"
> >     if ($opt_u);
> >
> > print <<EOF;
> > This happened on $wday $mon $day $tm
> > Summary information: $summary
> > Arguments passed to this script: @ARGV
> > Detailed information follows:
> >
> > EOF
> >
> > while (<STDIN>) {
> >     print;
> > }
> > if ($MON_ALERTTYPE == 'failure') {
> >         system("/sbin/shutdown -r 1");
> >         }
> >
> > (Sorry for the large post.)
> > The program is meant to send standard output to a log file and reboot
> > the system.
> > I know this reboot might seem extreme, but the server it is being designed
> > for is very unstable (it only has to last another 2 weeks or so before
> > replacement).
> > But the routine would be helpful on a couple of servers I have, as a
> > 'worst case' problem solver :)
> >
> > Also, I'd like to ask if anyone has written an alert that shuts down a
> > service and starts it back up again?
> > Something like:
> > Alert type: service_restart
> > Extra Params: "service_shutdown", "service_startup", pause
> >
> > would execute something like:
> > system ('service',$service_shutdown);
> > sleep $pause;
> > system ('service',$service_startup);
> >
> > This could then very easily be made to handle restarting any number of
> > services on a Linux box, without having to write an individual alert for
> > each service.
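
A minimal sketch of the generic restart idea described above, under stated assumptions: it assumes the box has a "service NAME stop|start" style command, and that the service name and pause would reach the alert as extra arguments from mon.cf. The names restart_commands and restart_service are hypothetical, and this is not an existing mon alert.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the stop and start command lists for one
# service. Assumes a "service NAME stop|start" command is available.
sub restart_commands {
    my ($service) = @_;
    return ( [ 'service', $service, 'stop' ], [ 'service', $service, 'start' ] );
}

# Runs stop, waits $pause seconds, then runs start.
# $run is a code ref that executes one command list and returns its exit
# status (0 = success). It defaults to system() in list form, which
# bypasses the shell. Returns 1 if the start command succeeded, else 0.
sub restart_service {
    my ($service, $pause, $run) = @_;
    $run ||= sub { system(@{ $_[0] }) };
    my ($stop, $start) = restart_commands($service);
    $run->($stop);          # a failed stop is ignored; the service may already be down
    sleep $pause if $pause;
    return $run->($start) == 0 ? 1 : 0;
}

# Inside an alert script this might be called as:
#   my ($service, $pause) = @ARGV;   # extra arguments, after getopts
#   restart_service($service, $pause);
```

Taking the runner as a code ref makes it possible to try the logic with a stub before letting it touch real services.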
> >
> > thanks for any help
> >
> > Matt Lowe
> > mon <at> mlsis.org
> >
> 
> 
> 
> 
> > _______________________________________________
> > mon mailing list
> > [EMAIL PROTECTED]
> > http://linux.kernel.org/mailman/listinfo/mon
> >
> 
> 
