We are in the process of upgrading our Nagios setup from an ancient
1.x installation to 3.2.1.

We want to re-implement the following policy for notifications:
 * During the work day, everyone in the contact group gets notifications
 * Outside of working hours,
   * The on-call person gets paged immediately
   * Everyone in the contact group gets paged after 5 notifications

This is currently implemented by having 2 contact objects for every
user, one for work-hours and one for off-hours, each with appropriate
*_notification_periods. Notifications go to all N*2 contacts, and each
individual contact's notification periods determine whether or not to
page.

I believe we can implement the policy using escalations. First, every
host and service gets a catch-all contact that delivers notifications
to a shared IMAP folder, which satisfies the requirement that these
objects have a contact.
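
A rough sketch of what that catch-all contact might look like. The
contact name and email address are hypothetical, and the 24x7
timeperiod and notify-*-by-email commands are assumed to exist as in
the sample configuration:

```
define contact {
  contact_name                    imap-archive              ; hypothetical name
  alias                           Shared IMAP folder
  host_notification_period        24x7                      ; assumed timeperiod
  service_notification_period     24x7
  host_notification_options       d,u,r
  service_notification_options    w,u,c,r
  host_notification_commands      notify-host-by-email      ; assumed command
  service_notification_commands   notify-service-by-email   ; assumed command
  email                           nagios-archive@example.com ; hypothetical address
}
```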

Next, to implement the work-hours piece, a service escalation such as:

define serviceescalation {
  hostgroup_name        unix-hosts
  service_description   *
  contact_groups        unix-admins
  first_notification    1
  last_notification     0
  escalation_period     work-hours
}

And a corresponding hostescalation.
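
For reference, a sketch of the host counterpart, assuming the same
hostgroup, contact group, and timeperiod names as in the service
escalation:

```
define hostescalation {
  hostgroup_name        unix-hosts
  contact_groups        unix-admins
  first_notification    1
  last_notification     0
  escalation_period     work-hours
}
```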

To implement the off-hours piece, escalations such as:

define serviceescalation {
  hostgroup_name        unix-hosts
  contact_groups        on-call
  service_description   *
  first_notification    1
  last_notification     0
  escalation_period     off-hours
}

define serviceescalation {
  hostgroup_name        unix-hosts
  contact_groups        unix-admins
  service_description   *
  first_notification    5
  last_notification     0
  escalation_options    c
  escalation_period     off-hours
}
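
These escalations assume that work-hours and off-hours timeperiods
exist. A minimal sketch, assuming a 09:00-17:00 Monday-to-Friday work
day (the hours are illustrative only):

```
define timeperiod {
  timeperiod_name   work-hours
  alias             Working hours (assumed 09:00-17:00, Mon-Fri)
  monday            09:00-17:00
  tuesday           09:00-17:00
  wednesday         09:00-17:00
  thursday          09:00-17:00
  friday            09:00-17:00
}

define timeperiod {
  timeperiod_name   off-hours
  alias             Everything outside the assumed working hours
  sunday            00:00-24:00
  monday            00:00-09:00,17:00-24:00
  tuesday           00:00-09:00,17:00-24:00
  wednesday         00:00-09:00,17:00-24:00
  thursday          00:00-09:00,17:00-24:00
  friday            00:00-09:00,17:00-24:00
  saturday          00:00-24:00
}
```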

With this setup, all "real" notifications happen through these
escalations, in effect using them as a "servicenotification" object,
which kinda seems better from a data-normalization perspective (but
IANADBA).

Does this seem like it should work as I expect? Can anyone see any
problems? I have not actually tried this, so I don't know if it works.

As an aside, does anyone have ideas on how to test this kind of
thing? Perhaps by submitting passive check results and using
timeperiods such as "oddminutes" and "evenminutes"?

Wil

