2012-01-10 22:48 Dario Minnucci <mid...@debian.org>:
| Hi again Jari,
|
| I'm unable to download the patch from the URL [0] you've stated.
|
| Please, would you mind to forward it to me directly or add it again to this bugreport?
|
| Regards,
|
| [0] https://sourceforge.net/tracker/download.php?group_id=170&atid=100170&file_id=432600&aid=3471944
The CVS version is a little newer than the one in Debian. The following
should work for the latest Git version of mon in Debian, against:

    8c06d4b 2012-01-10 debian/copyright: Source URL updated.

Jari
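A minimal sketch of one way to apply the patch, in case it helps. It assumes this mail has been saved as a file that `git am` can read and that a clone of the Debian packaging repository for mon is available locally. The function name `apply_patch`, the repository path, and the patch file name are hypothetical; only the commit id 8c06d4b comes from the message above.

```shell
#!/bin/sh
# Sketch only: apply_patch, the repository path and the patch file name
# are illustrative assumptions, not part of the mon packaging.

apply_patch() {
    repo_dir=$1          # local clone of the packaging repository (assumed)
    patch_file=$2        # saved patch; use an absolute path (assumed name)
    base=${3:-8c06d4b}   # commit the patch was prepared against

    # Start from the commit the patch was prepared against (detached HEAD).
    git -C "$repo_dir" checkout -q "$base" &&
    # Dry run: verify that the diff applies cleanly before committing anything.
    git -C "$repo_dir" apply --check "$patch_file" &&
    # Apply it as a commit, keeping the author and commit message from the mail.
    git -C "$repo_dir" am "$patch_file"
}

# Usage (paths are placeholders):
#   apply_patch "$HOME/src/mon" /tmp/mon-doc-order.patch
```

The `git apply --check` step is optional; it only reports whether the diff would apply, so a failure leaves the working tree untouched. If `git am` stops halfway, `git am --abort` returns the tree to the state before the attempt.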
>From ee6b08528896bcf80b68d65cea04df7e4a8c7e23 Mon Sep 17 00:00:00 2001 From: Jari Aalto <jari.aa...@cante.net> Date: Wed, 11 Jan 2012 11:23:49 +0200 Subject: [PATCH] doc/mon.8: Order items alphabetically Organization: Private Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Signed-off-by: Jari Aalto <jari.aa...@cante.net> --- doc/mon.8 | 911 +++++++++++++++++++++++++++++-------------------------------- 1 files changed, 430 insertions(+), 481 deletions(-) diff --git a/doc/mon.8 b/doc/mon.8 index 7ec784d..93c4ba6 100644 --- a/doc/mon.8 +++ b/doc/mon.8 @@ -42,7 +42,7 @@ and triggering alerts upon detecting failures. .B mon was designed to be open in the sense that it supports arbitrary monitoring facilities and alert methods via a common interface, which -are easily implemented through programs (in C, Perl, shell, etc.), +are easily implemented through programs (in C, Perl, shell, etc.), SNMP traps, and special Mon (UDP packet) traps. .SH OPTIONS @@ -94,9 +94,6 @@ Fork and run as a daemon process. This is the preferred way to run .BR mon . .TP -.BI \-h -Print help information. -.TP .BI \-i\ secs Sleep interval, in seconds. Defaults to 1. This shouldn't need to be adjusted for any reason. @@ -108,16 +105,16 @@ entries. Defaults to 100. .TP .BI \-l\ statetype -Load state from the last saved state file. The -supported saved state types are +Load state from the last saved state file. The +supported saved state types are .B disabled -for disabled watches, services, and hosts, +for disabled watches, services, and hosts, .B opstatus -for failure/alert/ack status of +for failure/alert/ack status of all services, -and -.B all -for both. If no statetype is provided, +and +.B all +for both. If no statetype is provided, .B disabled is assumed. .TP @@ -179,6 +176,9 @@ by default). .TP .BI \-v Print version information. +.TP +.BI \-h +Print help information. .SH DEFINITIONS .TP @@ -426,32 +426,40 @@ and directories if not specified. 
They are invoked with the following command-line parameters: -.TP -.BI \-s\ service -Service tag from the configuration file. + .TP .BI \-g\ group Host group name from the configuration file. + .TP .BI \-h\ hosts The expanded version of the host group, space delimited, but contained in one shell "word". + .TP .BI \-l\ alertevery The number of seconds until the next alarm will be sent. + .TP .BI \-O This option is supplied to an alert only if the alert is being generated as a result of an expected traap timing out + +.TP +.BI \-s\ service +Service tag from the configuration file. + .TP .BI \-t\ time The time (in .BR time (2) format) of when this failure condition was detected. + .TP .BI \-T This option is supplied to an alert only if the alert was triggered by a trap + .TP .B \-u This option is supplied to an alert only if it is being @@ -466,26 +474,10 @@ variables defined by the user in the service definition, in addition to the following which are explicitly set by the server: .TP -.B MON_LAST_SUMMARY -The first line of the output from the last time the -monitor exited. - -.TP -.B MON_LAST_OUTPUT -The entire output of the monitor from the last time it -exited. - -.TP -.B MON_LAST_FAILURE -The time(2) of the last failure for this service. - -.TP -.B MON_FIRST_FAILURE -The time(2) of the first time this service failed. - -.TP -.B MON_LAST_SUCCESS -The time(2) of the last time this service passed. +.B MON_ALERTTYPE +Has one of the following values: "failure", "up", "startup", +"trap", or "traptimeout", and signifies the type of alert which +was triggered. .TP .B MON_DESCRIPTION @@ -495,33 +487,30 @@ configuration file using the tag. .TP -.B MON_GROUP -The watch group which triggered this alarm +.B MON_FIRST_FAILURE +The time(2) of the first time this service failed. 
.TP -.B MON_SERVICE -The service heading which generated this alert +.B MON_GROUP +The watch group which triggered this alarm .TP -.B MON_RETVAL -The exit value of the failed monitor program, or return value -as accepted from a trap. +.B MON_LAST_FAILURE +The time(2) of the last failure for this service. .TP -.B MON_OPSTATUS -The operational status of the service. +.B MON_LAST_OUTPUT +The entire output of the monitor from the last time it +exited. .TP -.B MON_ALERTTYPE -Has one of the following values: "failure", "up", "startup", -"trap", or "traptimeout", and signifies the type of alert which -was triggered. +.B MON_LAST_SUCCESS +The time(2) of the last time this service passed. .TP -.B MON_TRAP_INTENDED -This is only set when an unknown mon trap is received and caught -by the default/defaut watch/service. This contains colon -separated entries of the trap's intended watch group and service name. +.B MON_LAST_SUMMARY +The first line of the output from the last time the +monitor exited. .TP .B MON_LOGDIR @@ -531,6 +520,15 @@ as indicated by the global configuration variable. .TP +.B MON_OPSTATUS +The operational status of the service. + +.TP +.B MON_RETVAL +The exit value of the failed monitor program, or return value +as accepted from a trap. + +.TP .B MON_STATEDIR The directory where state files should be kept, as indicated by the @@ -538,6 +536,16 @@ as indicated by the global configuration variable. .TP +.B MON_SERVICE +The service heading which generated this alert + +.TP +.B MON_TRAP_INTENDED +This is only set when an unknown mon trap is received and caught +by the default/defaut watch/service. This contains colon +separated entries of the trap's intended watch group and service name. + +.TP .B MON_CFBASEDIR The directory where configuration files should be kept, as indicated by the @@ -604,56 +612,6 @@ hash is only generated upon startup or after a "reset" command, so newly added alert scripts will not be recognized until a "reset" is performed. 
.TP -.BI "mondir = " dir -.I dir -is the full path to the monitor scripts. This value may also be -set by the -.B \-s -command-line parameter. If this path does not begin with a "/", it will be -relative to -.IR basedir . - -Multiple alert paths may be specified by separating them with -a colon. All paths must be absolute. - -When the configuration file is read, all monitors referenced from the -configuration will be looked up in each of these paths, and the -full path to the first -instance of the monitor found is stored in a hash. This hash is only -generated upon startup or after a "reset" command, so newly added monitor -scripts will not be recognized until a "reset" is performed. - -.TP -.BI "statedir = " dir -.I dir -is the full path to the state directory. -.B mon -uses this directory to save various state information. If this path does not begin with a "/", it will be -relative to -.IR basedir . - -.TP -.BI "logdir = " dir -.I dir -is the full path to the log directory. -.B mon -uses this directory to save various logs, including -the downtime log. If this path does not begin with a "/", it will be -relative to -.IR basedir . - -.TP -.BI "basedir = " dir -.I dir -is the full path for the state, log, monitor, and alert directories. - -.TP -.BI "cfbasedir = " dir -.I dir -is the full path where all the config files can be found -(monusers.cf, auth.cf, etc.). - -.TP .BI "authfile = " file .I file is the path to the authentication file. If the path does not begin @@ -661,6 +619,11 @@ with a "/", it will be relative to .IR cfbasedir . .TP +.BI "basedir = " dir +.I dir +is the full path for the state, log, monitor, and alert directories. + +.TP .BI "authtype = " "type [type...]" .I type is the type of authentication to use. A space-separated list of @@ -703,52 +666,62 @@ If .I type is .BR trustlocal , -then if the client connection comes from locahost, the username passed from -the client will be trusted, and the password will be ignored. 
This can be used -when you want the client to handle the authentication for you. I.e. a CGI script +then if the client connection comes from locahost, the username passed from +the client will be trusted, and the password will be ignored. This can be used +when you want the client to handle the authentication for you. I.e. a CGI script using one of the many apache authentication methods. .TP -.BI "userfile = " file -This file is used when -.B authtype -is set to -.IR userfile . -It consists of a sequence of lines of the format -.BR "'username : password'" . -.B password -is stored as the hash returned by the standard Unix -crypt(3) function. -.B NOTE: -the format of this file is compatible with the Apache file based -username/password file format. It is possible to use the -.I htpasswd -program supplied with Apache to manage the mon userfile. - -Blank lines and lines beginning with # are ignored. +.BI "cfbasedir = " dir +.I dir +is the full path where all the config files can be found +(monusers.cf, auth.cf, etc.). .TP -.BI "pamservice = " service -The PAM service used for authentication. This is applicable -only if "pam" is specified as a parameter to the -.B authtype -setting. If this global is not defined, it defaults -to -.BR "passwd" . +.BI "cltimeout = " secs +Sets the client inactivity timeout to +.I secs. +This is meant to help thwart denial of service attacks or +recover from crashed clients. +.I secs +is interpreted as a "1h/1m/1s" string, where +"1m" = 60 seconds. .TP -.BI "serverbind = " addr +.BI "dep_behavior = " {a|m|hm} +.B dep_behavior +controls whether the dependency expression +suppresses one of: the running of alerts, the running of +monitors, or the passing of individual hosts to the monitors. +Read more about the behavior in the "Service Definitions" +section below. + +This is a global setting which controls the default +settings for the service-specified variable. 
.TP -.BI "trapbind = " addr +.BI "dep_memory = " timeval +If set, dep_memory will cause dependencies to continue to prevent +alerts/monitoring for a period of time after the service returns to a +normal state. This can be used to prevent over-eager alerting when a +machine is rebooting, for example. See the explanation of +.I interval +in the "Service Definitions" section +for a description of +.IR timeval . -.B serverbind -and -.B trapbind -specify which address to bind the server and trap ports to, respectively. -If these are not defined, the default address is INADDR_ANY, which -allows connections on all interfaces. For security reasons, -it could be a good idea to bind only to the loopback interface. +This is a global setting which controls the default +settings for the service-specified variable. + +.TP +.BI "dep_recur_limit = " depth +Limit dependency recursion level to +.IR depth . +If dependency recursion (dependencies which depend on other dependencies) +tries to go beyond +.IR depth , +then the recursion is aborted and a messages is logged to syslog. +The default limit is 10. .TP .BI "dtlogfile = " file @@ -780,23 +753,18 @@ is the frequency (in seconds) that the service is polled. is the summary line from when the service was failing. .TP -.BI "monerrfile = " filename -By default, when mon daemonizes itself, it connects -stdout and stderr to /dev/null. If -.B monerrfile -is set to a file, then stdout and stderr will be -appended to that file. In all cases stdin is connected -to /dev/null. If mon is told to run in the foreground -and to not daemonize, then none of this applies, since -stdin/stdout/stderr stay connected to whatever they -were at the time of invocation. - -.TP .BI "dtlogging = " yes/no Turns downtime logging on or off. The default is off. .TP +.BI "historicfile = " file +If this variable is set, then alerts are logged to +.IR file , +and upon startup, some (or all) of the past history is read +into memory. 
+ +.TP .BI "histlength = " num .I num is the the maximum number of events to be retained @@ -806,39 +774,75 @@ This value may also be set by the command-line parameter. .TP -.BI "historicfile = " file -If this variable is set, then alerts are logged to -.IR file , -and upon startup, some (or all) of the past history is read -into memory. +.BI "logdir = " dir +.I dir +is the full path to the log directory. +.B mon +uses this directory to save various logs, including +the downtime log. If this path does not begin with a "/", it will be +relative to +.IR basedir . .TP -.BI "historictime = " timeval -.I num -is the amount of the history file to read upon startup. -"Now" - -.I timeval -is read. See the explanation of -.I interval -in the "Service Definitions" section -for a description of -.IR timeval . +.BI "maxprocs = " num +Throttles the number of concurrently forked processes to +.I num. +The intent is to provide a safety net for the unlikely situation +when the server tries to take on too many tasks at once. Note that this +situation has only been reported to happen when trying to use a garbled +configuration file! You don't want to use a garbled configuration +file now, do you? .TP -.BI "serverport = " port -.I port -is the TCP port number that the server should bind to. This value may also be +.BI "mondir = " dir +.I dir +is the full path to the monitor scripts. This value may also be set by the -.B \-p -command-line parameter. Normally this port is looked up via getservbyname(3), -and it defaults to 2583. +.B \-s +command-line parameter. If this path does not begin with a "/", it will be +relative to +.IR basedir . + +Multiple alert paths may be specified by separating them with +a colon. All paths must be absolute. + +When the configuration file is read, all monitors referenced from the +configuration will be looked up in each of these paths, and the +full path to the first +instance of the monitor found is stored in a hash. 
This hash is only +generated upon startup or after a "reset" command, so newly added monitor +scripts will not be recognized until a "reset" is performed. .TP -.BI "trapport = " port -.I port -is the UDP port number that the trap server should bind to. -Normally this port is looked up via getservbyname(3), -and it defaults to 2583. +.BI "monremote = " program + +If set, this external program will be called by Mon when various +client requests are processed. This can be used to propagate those +changes from one Mon server to another, if you have multiple +monitoring machines. An example script, +.B monremote.pl +is available in the clients directory. + +.TP +.BI "monerrfile = " filename +By default, when mon daemonizes itself, it connects +stdout and stderr to /dev/null. If +.B monerrfile +is set to a file, then stdout and stderr will be +appended to that file. In all cases stdin is connected +to /dev/null. If mon is told to run in the foreground +and to not daemonize, then none of this applies, since +stdin/stdout/stderr stay connected to whatever they +were at the time of invocation. + +.TP +.BI "pamservice = " service +The PAM service used for authentication. This is applicable +only if "pam" is specified as a parameter to the +.B authtype +setting. If this global is not defined, it defaults +to +.BR "passwd" . .TP .BI "pidfile = " path @@ -849,26 +853,6 @@ by the command-line parameter. .TP -.BI "maxprocs = " num -Throttles the number of concurrently forked processes to -.I num. -The intent is to provide a safety net for the unlikely situation -when the server tries to take on too many tasks at once. Note that this -situation has only been reported to happen when trying to use a garbled -configuration file! You don't want to use a garbled configuration -file now, do you? - -.TP -.BI "cltimeout = " secs -Sets the client inactivity timeout to -.I secs. -This is meant to help thwart denial of service attacks or -recover from crashed clients. 
-.I secs -is interpreted as a "1h/1m/1s" string, where -"1m" = 60 seconds. - -.TP .BI "randstart = " interval When the server starts, normally all services will not be scheduled until the interval defined in the respective service section. @@ -888,48 +872,16 @@ will be a random number between zero and seconds. .TP -.BI "dep_recur_limit = " depth -Limit dependency recursion level to -.IR depth . -If dependency recursion (dependencies which depend on other dependencies) -tries to go beyond -.IR depth , -then the recursion is aborted and a messages is logged to syslog. -The default limit is 10. - -.TP -.BI "dep_behavior = " {a|m|hm} -.B dep_behavior -controls whether the dependency expression -suppresses one of: the running of alerts, the running of -monitors, or the passing of individual hosts to the monitors. -Read more about the behavior in the "Service Definitions" -section below. - -This is a global setting which controls the default -settings for the service-specified variable. - -.TP -.BI "dep_memory = " timeval -If set, dep_memory will cause dependencies to continue to prevent -alerts/monitoring for a period of time after the service returns to a -normal state. This can be used to prevent over-eager alerting when a -machine is rebooting, for example. See the explanation of -.I interval -in the "Service Definitions" section -for a description of -.IR timeval . - -This is a global setting which controls the default -settings for the service-specified variable. +.BI "serverbind = " addr .TP -.BI "syslog_facility = " facility -Specifies the syslog facility used for logging. -.B daemon -is the default. - - +.BI "serverport = " port +.I port +is the TCP port number that the server should bind to. This value may also be +set by the +.B \-p +command-line parameter. Normally this port is looked up via getservbyname(3), +and it defaults to 2583. 
.TP .BI "startupalerts_on_reset = " {yes|no} @@ -939,14 +891,17 @@ If set to "yes", startupalerts will be invoked when the client command is executed. The default is "no". .TP -.BI "monremote = " program +.BI "statedir = " dir +.I dir +is the full path to the state directory. +.B mon +uses this directory to save various state information. -If set, this external program will be called by Mon when various -client requests are processed. This can be used to propagate those -changes from one Mon server to another, if you have multiple -monitoring machines. An example script, -.B monremote.pl -is available in the clients directory. +.TP +.BI "syslog_facility = " facility +Specifies the syslog facility used for logging. +.B daemon +is the default. .SS "Hostgroup Entries" @@ -963,7 +918,7 @@ The hostgroup definition ends with a blank line. For example: .RS .nf hostgroup servers nameserver smtpserver nntpserver - nfsserver httpserver smbserver + nfsserver httpserver smbserver hostgroup router_group cisco7000 agsplus .fi @@ -971,7 +926,7 @@ hostgroup router_group cisco7000 agsplus .SS "View Entries" View entries begin with the keyword -.BR view , +.BR view , and are followed by a view tag and the names of one or more hostgroups. The view tag must be composed of alphanumeric characters, a dash ("-"), a period ("."), @@ -1031,89 +986,16 @@ The following configuration parameters are valid only following a service definition: .TP -.BI VARIABLE= "value" -Environment variables may be defined for each service, which will be -included in the environment of monitors and alerts. Variables must -be specified in all capital letters, must begin with an alphabetical -character or an underscore, and there must be no spaces to the left -of the equal sign. - -.TP -.BI interval " timeval" -The keyword -.B interval -followed by a time value specifies the frequency that -a monitor script will be triggered. 
-Time values are defined as "30s", "5m", "1h", or "1d", -meaning 30 seconds, 5 minutes, 1 hour, or 1 day. The numeric portion -may be a fraction, such as "1.5h" or an hour and a half. This -format of a time specification will be referred to as -.IR timeval . - -.TP -.BI failure_interval " timeval" -Adjusts the polling interval to -.I timeval -when the service check is failing. Resets the interval -to the original when the service succeeds. - -.TP -.BI traptimeout " timeval" -This keyword takes the same time specification argument as -.BI interval , -and makes the service expect a trap from an external source -at least that often, else a failure will be registered. This is -used for a heartbeat-style service. - -.TP -.BI trapduration " timeval" -If a trap is received, the status of the service the trap was delivered -to will normally remain constant. If -.B trapduration -is specified, the status of the service will remain in a failure -state for the duration specified by -.IR timeval , -and then it will be reset to "success". - -.TP -.BI randskew " timeval" -Rather than schedule the monitor script to run at the start of each -interval, randomly adjust the interval specified by the -.B interval -parameter by plus-or-minus -.B "randskew". -The skew value is specified as the -.B interval -parameter: "30s", "5m", etc... -For example if -.B "interval" -is 1m, and -.B "randskew" -is "5s", then -.I mon -will schedule the monitor script some time between every -55 seconds and 65 seconds. -The intent is to help distribute the load on the server when -many services are scheduled at the same intervals. - -.TP -.BI monitor " monitor-name [arg...]" -The keyword -.B monitor -followed by a script name and arguments -specifies the monitor to run when the timer -expires. Shell-like quoting conventions are -followed when specifying the arguments to send -to the monitor script. 
-The script is invoked from the directory -given with the -.B \-s -argument, and all following words are supplied -as arguments to the monitor program, followed by the -list of hosts in the group referred to by the current watch group. -If the monitor line ends with ";;" as a separate word, -the host groups are not appended to the argument list -when the program is invoked. +\fB alertdepend, monitordepend, hostdepend\fP "dependexpression" +These keywords allow you to specify multiple dependency expressions of +different types. Each one corresponds to the different +.B dep_behavior +settings listed above. They will be evaluated independently in the different +contexts as listed above. If +.B depend +is present, it takes precedence over the matching keyword, depending on the +.B dep_behavior +setting. .TP .B allow_empty_group @@ -1126,26 +1008,6 @@ to invoke the monitor when all hosts in a hostgroup have been disabled. .TP -.BI description " descriptiontext" -The text following -.B description -is queried by client programs, passed to alerts and monitors via an -environment variable. It should contain a brief description of the -service, suitable for inclusion in an email or on a web page. - -.TP -.BI exclude_hosts " host [host...]" -Any hosts listed after -.B exclude_hosts -will be excluded from the service check. - -.TP -.BI exclude_period " periodspec" -Do not run a scheduled monitor during the time -identified by -.IR periodspec . - -.TP .BI depend " dependexpression" The .B depend @@ -1218,22 +1080,6 @@ expression will be used recursively in this case. .TP -.BI alertdepend " dependexpression" -.TP -.BI monitordepend " dependexpression" -.TP -.BI hostdepend " dependexpression" -These keywords allow you to specify multiple dependency expressions of -different types. Each one corresponds to the different -.B dep_behavior -settings listed above. They will be evaluated independently in the different -contexts as listed above. 
If -.B depend -is present, it takes precedence over the matching keyword, depending on the -.B dep_behavior -setting. - -.TP .BI "dep_memory " timeval If set, dep_memory will cause dependencies to continue to prevent alerts/monitoring for a period of time after the service returns to a @@ -1245,14 +1091,93 @@ for a description of .IR timeval . .TP -.BI redistribute " alert [arg...]" -A service may have one redistribute option, which is a special form of an -an alert definition. This alert will be called on every service status -update, even sequential success status updates. This can be used to -integrate Mon with another monitoring system, or to link together multiple -Mon servers via an alert script that generates Mon traps. See the "ALERT -PROGRAMS" section above for a list of the parameters mon will pass -automatically to alert programs. +.BI description " descriptiontext" +The text following +.B description +is queried by client programs, passed to alerts and monitors via an +environment variable. It should contain a brief description of the +service, suitable for inclusion in an email or on a web page. + +.TP +.BI exclude_hosts " host [host...]" +Any hosts listed after +.B exclude_hosts +will be excluded from the service check. + +.TP +.BI exclude_period " periodspec" +Do not run a scheduled monitor during the time +identified by +.IR periodspec . + +.TP +.BI failure_interval " timeval" +Adjusts the polling interval to +.I timeval +when the service check is failing. Resets the interval +to the original when the service succeeds. + +.TP +.BI interval " timeval" +The keyword +.B interval +followed by a time value specifies the frequency that +a monitor script will be triggered. +Time values are defined as "30s", "5m", "1h", or "1d", +meaning 30 seconds, 5 minutes, 1 hour, or 1 day. The numeric portion +may be a fraction, such as "1.5h" or an hour and a half. This +format of a time specification will be referred to as +.IR timeval . 
+ +.TP +.BI monitor " monitor-name [arg...]" +The keyword +.B monitor +followed by a script name and arguments +specifies the monitor to run when the timer +expires. Shell-like quoting conventions are +followed when specifying the arguments to send +to the monitor script. +The script is invoked from the directory +given with the +.B \-s +argument, and all following words are supplied +as arguments to the monitor program, followed by the +list of hosts in the group referred to by the current watch group. +If the monitor line ends with ";;" as a separate word, +the host groups are not appended to the argument list +when the program is invoked. + +.TP +.BI randskew " timeval" +Rather than schedule the monitor script to run at the start of each +interval, randomly adjust the interval specified by the +.B interval +parameter by plus-or-minus +.B "randskew". +The skew value is specified as the +.B interval +parameter: "30s", "5m", etc... +For example if +.B "interval" +is 1m, and +.B "randskew" +is "5s", then +.I mon +will schedule the monitor script some time between every +55 seconds and 65 seconds. +The intent is to help distribute the load on the server when +many services are scheduled at the same intervals. + +.TP +.BI redistribute " alert [arg...]" +A service may have one redistribute option, which is a special form of an +an alert definition. This alert will be called on every service status +update, even sequential success status updates. This can be used to +integrate Mon with another monitoring system, or to link together multiple +Mon servers via an alert script that generates Mon traps. See the "ALERT +PROGRAMS" section above for a list of the parameters mon will pass +automatically to alert programs. .TP .BI unack_summary @@ -1261,6 +1186,32 @@ failure message changes. In most common usage the summary is the list of hosts that are failing, so additional hosts failing would remove an ack. 
+.TP +.BI traptimeout " timeval" +This keyword takes the same time specification argument as +.BI interval , +and makes the service expect a trap from an external source +at least that often, else a failure will be registered. This is +used for a heartbeat-style service. + +.TP +.BI trapduration " timeval" +If a trap is received, the status of the service the trap was delivered +to will normally remain constant. If +.B trapduration +is specified, the status of the service will remain in a failure +state for the duration specified by +.IR timeval , +and then it will be reset to "success". + +.TP +.BI VARIABLE= "value" +Environment variables may be defined for each service, which will be +included in the environment of monitors and alerts. Variables must +be specified in all capital letters, must begin with an alphabetical +character or an underscore, and there must be no spaces to the left +of the equal sign. + .SS "Period Definitions" @@ -1302,43 +1253,46 @@ must specify a label such as "period t1: wd {Sun-Sat}" and "period t2: wd {Sun-Sat}". .TP -.BI alertevery " timeval [observe_detail | strict]" -The -.B alertevery -keyword (within a -.B period -definition) takes the same type of argument as the -.B interval -variable, and limits the number of times an alert -is sent when the service continues to fail. -For example, if the interval is "1h", then only -the alerts in the period section will only -be triggered once every hour. If the -.B alertevery -keyword is -omitted in a period entry, an alert will be sent -out every time a failure is detected. By default, -if the summary output of two successive failures changes, -then the alertevery interval is overridden, and an alert -will be sent. -If the string -"observe_detail" is the last argument, then both the summary -and detail output lines will be considered when comparing the -output of successive failures. 
-If the string "strict" is the last argument, then the output -of the monitor or the state change of the service will have -no effect on when alerts are sent. That is, "alertevery 24h strict" -will send only one alert every 24 hours, no matter what. -Please refer to the -.B "ALERT DECISION LOGIC" -section for a detailed explanation of how alerts are suppressed. +.BI alert " alert [arg...]" +A period may contain multiple alerts, which are triggered +upon failure of the service. An alert is specified with +the +.B alert +keyword, followed by an optional +.B exit +parameter, and arguments which are interpreted the same as +the +.B monitor +definition, but without the ";;" exception. The +.B exit +parameter takes the form of +.B "exit=x" +or +.B "exit=x-y" +and has the effect that the alert is only called if the +exit status of the monitor script falls within the range +of the +.B exit +parameter. If, for example, the alert line is +.I "alert exit=10-20 mail.alert mis" +then +.I mail-alert +will only be invoked with +.I mis +as its arguments if the monitor +program's exit value is between 10 and 20. This feature +allows you to trigger different alerts at different +severity levels (like when free disk space goes from 8% to 3%). + +See the +.B "ALERT PROGRAMS" +section above for a list of the pramaeters mon will pass +automatically to alert programs. .TP .BI alertafter " num" - .TP .BI alertafter " num timeval" - .TP .BI alertafter " timeval" The @@ -1378,13 +1332,36 @@ of the number of failures noticed within that interval. .TP -.BI numalerts " num" - -This variable tells the server to call no more than -.I num -alerts during a -failure. The alert counter is kept on a per-period basis, -and is reset upon each success. 
+.BI alertevery " timeval [observe_detail | strict]" +The +.B alertevery +keyword (within a +.B period +definition) takes the same type of argument as the +.B interval +variable, and limits the number of times an alert +is sent when the service continues to fail. +For example, if the interval is "1h", then only +the alerts in the period section will only +be triggered once every hour. If the +.B alertevery +keyword is +omitted in a period entry, an alert will be sent +out every time a failure is detected. By default, +if the summary output of two successive failures changes, +then the alertevery interval is overridden, and an alert +will be sent. +If the string +"observe_detail" is the last argument, then both the summary +and detail output lines will be considered when comparing the +output of successive failures. +If the string "strict" is the last argument, then the output +of the monitor or the state change of the service will have +no effect on when alerts are sent. That is, "alertevery 24h strict" +will send only one alert every 24 hours, no matter what. +Please refer to the +.B "ALERT DECISION LOGIC" +section for a detailed explanation of how alerts are suppressed. .TP .B "no_comp_alerts" @@ -1394,41 +1371,32 @@ service state changes from failure to success, rather than only after a corresponding "down" alert. .TP -.BI alert " alert [arg...]" -A period may contain multiple alerts, which are triggered -upon failure of the service. An alert is specified with -the -.B alert -keyword, followed by an optional -.B exit -parameter, and arguments which are interpreted the same as -the -.B monitor -definition, but without the ";;" exception. The -.B exit -parameter takes the form of -.B "exit=x" -or -.B "exit=x-y" -and has the effect that the alert is only called if the -exit status of the monitor script falls within the range -of the -.B exit -parameter. 
If, for example, the alert line is -.I "alert exit=10-20 mail.alert mis" -then -.I mail-alert -will only be invoked with -.I mis -as its arguments if the monitor -program's exit value is between 10 and 20. This feature -allows you to trigger different alerts at different -severity levels (like when free disk space goes from 8% to 3%). +.BI numalerts " num" -See the -.B "ALERT PROGRAMS" -section above for a list of the pramaeters mon will pass -automatically to alert programs. +This variable tells the server to call no more than +.I num +alerts during a +failure. The alert counter is kept on a per-period basis, +and is reset upon each success. + +.TP +.BI startupalert " alert [arg...]" +A +.B startupalert +is only called when the +.B mon +server starts execution, or when a "reset" +command was issued to the server, depending on +the setting of the +.B startupalerts_on_reset +global. +Unlike other alerts, +.B startupalerts +are not called following the +exit of a monitor, i.e. they are +called in their own right, therefore the +"exit=" argument is not applicable to +.B startupalert. .TP .BI upalert " alert [arg...]" @@ -1451,30 +1419,11 @@ as an upalert. Multiple upalerts may be specified for each period definition. Set the per-period .B no_comp_alerts -option to +option to send an upalert regardless if whether or not a "down" alert was sent. .TP -.BI startupalert " alert [arg...]" -A -.B startupalert -is only called when the -.B mon -server starts execution, or when a "reset" -command was issued to the server, depending on -the setting of the -.B startupalerts_on_reset -global. -Unlike other alerts, -.B startupalerts -are not called following the -exit of a monitor, i.e. they are -called in their own right, therefore the -"exit=" argument is not applicable to -.B startupalert. 
- -.TP .BI upalertafter " timeval" The .B upalertafter @@ -1511,7 +1460,7 @@ one of the following statements: .RS .nf - command section + command section .fi .RE @@ -1519,7 +1468,7 @@ or .RS .nf - trap section + trap section .fi .RE @@ -1547,13 +1496,13 @@ An example configuration file: .RS .nf command section -list: all -reset: root,admin -loadstate: root -savestate: root +list: all +reset: root,admin +loadstate: root +savestate: root trap section -127.0.0.1 root r@@tp4sswrd +127.0.0.1 root r@@tp4sswrd .fi .RE @@ -1603,11 +1552,11 @@ Here is a simple config file example: .RS .nf watch trap-service - service host1-disks - description TRAP: for host1 disk status - period wd {Sun-Sat} - alert mail.alert some...@your.org - upalert mail.alert -u some...@your.org + service host1-disks + description TRAP: for host1 disk status + period wd {Sun-Sat} + alert mail.alert some...@your.org + upalert mail.alert -u some...@your.org .fi .RE @@ -1643,11 +1592,11 @@ Here is an example default facility: .RS .nf watch default - service default - description Default trap service - period wd {Sun-Sat} - alert mail.alert some...@your.org - upalert mail.alert -u some...@your.org + service default + description Default trap service + period wd {Sun-Sat} + alert mail.alert some...@your.org + upalert mail.alert -u some...@your.org .fi .RE -- 1.7.7.3