Somehow I don't think Dirk and Andy are on the same wavelength.  I may be on yet
another wavelength but let me see if my summary makes any sense at all.
 
Andy had a problem where a service goes down and, since it looks like it's going to
take a while to fix, Andy would like to acknowledge that fact.  Said service doesn't
run all the time, so SA only checks to see if it's up during those times that it's
supposed to be up.
 
The suggested way to acknowledge the lengthy repair time is to put the service into
maintenance mode.
 
Andy points out that if someone forgets to take the service out of maintenance mode,
then it doesn't get checked.  Since this service isn't checked all the time, he suggests
that as a new feature, SA allow for a self-terminating maintenance interval where the
termination time is determined by the check schedule.
 
Dirk responds that perhaps the termination time should just be a user-specified value,
either a time of day or an interval, given at the time the service is put into
maintenance mode.  He suggests that this might be more generally useful.
 
Dirk also indicates that there is no existing code which can calculate a time in the future
based on the check schedule, which makes Andy's option hard to implement.
 
Andy insists that his implementation would be better.  Dirk disagrees and here we are.
 
As I see the original problem, maintenance mode is a perfect way to acknowledge that a
service is being worked on, and I also see Andy's point that maintenance mode could be
left on by accident, preventing SA from ever checking the service again.
 
Dirk's addition of "maintenance until" is an elegant solution.
 
At the time the service is determined to be down, an operator acknowledges the problem
and provides an estimated time to repair with a "maintenance until".  At the end of this
interval, SA will again monitor the service on the specified schedule, and report it as down
if "now" is within the check schedule.
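
To check that I have the logic straight, here is a minimal sketch in Python.  It is
not SA code and not how SA is implemented; every name in it (resolve_until,
in_check_schedule, should_check) is made up for illustration, and I am assuming the
check schedule is a single daily start/end window.

```python
from datetime import datetime, time, timedelta

def resolve_until(now, value):
    """Turn the operator's input into an absolute end of maintenance.

    value is either an interval (timedelta) or a time of day (time).
    Hypothetical helper, assumed for this sketch only.
    """
    if isinstance(value, timedelta):
        return now + value
    candidate = datetime.combine(now.date(), value)
    if candidate <= now:
        # That time of day has already passed today, so use tomorrow.
        candidate += timedelta(days=1)
    return candidate

def in_check_schedule(now, start, end):
    """True if now falls inside the daily check window [start, end)."""
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end  # window wraps past midnight

def should_check(now, maintenance_until, start, end):
    """Check only when maintenance has expired and the schedule says so."""
    if maintenance_until is not None and now < maintenance_until:
        return False  # still in maintenance, stay quiet
    return in_check_schedule(now, start, end)
```

Note that nothing here needs to compute a future time from the check schedule.  The
operator supplies the end time, and the schedule is only consulted against "now" once
maintenance has run out.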
 
This appears to me to work for any service being monitored, both those monitored
constantly and those that are only checked periodically.  Now it's time for Dirk and Andy
to gang up on me because I don't understand anything! <G>
 
Regards,
 
Brad Morgan
IT Manager
Horizon Interactive Inc.
