[Shinken-devel] Escalations based on time!

nap Tue, 04 Jan 2011 04:37:57 -0800

Hi folks,

I'm very happy to announce that we now have the most requested feature on
http://shinken.ideascale.com : time-based escalation!

*Time-based escalation example:*
You can now define escalations like this:


# Escalate at 1 hour to Level2
define escalation{
       escalation_name          ToLevel2
       first_notification_time  60              ; at 1 hour, go here
       last_notification_time   120             ; after 2 hours, stop here
       notification_interval    1440
       escalation_period        24x7
       escalation_options       d,u,r,w,c
       contacts                 level2
}

# At 2 hours, go to level3 and stay there
define escalation{
       escalation_name          ToLevel3
       first_notification_time  120             ; at 2 hours, go here
       last_notification_time   0               ; no end: afterwards, stay here
       notification_interval    1440
       escalation_period        24x7
       escalation_options       d,u,r,w,c
       contacts                 level3
}

And you reference them from your hosts/services with just:
define host{
    host_name   webserver1
    [...]
    escalations     ToLevel2,ToLevel3
}

That's all for the definition. It's implicitly inherited, and much easier to
define than service/host escalations, which must be filled in with the
hosts/services they apply to.

So here: after 1 hour of notifications, you escalate to the level2 contact,
then, after 2 hours, you reach level3. Far easier to understand than with
notification numbers, isn't it?
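
To make the time windows concrete, here is a minimal Python sketch of how the
two escalations above could be matched against the elapsed time. It is only a
model for illustration, not Shinken's actual code: the Escalation class, the
is_active and contacts_at helpers and the "level1" default contact are all
hypothetical, and the upper bound is assumed to be exclusive so that t = 120
goes to level3 only.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    """Hypothetical model of one 'define escalation' block (times in minutes)."""
    name: str
    first_notification_time: int
    last_notification_time: int  # 0 means "no upper bound"
    contacts: str

    def is_active(self, elapsed: int) -> bool:
        # Not yet reached the start of this escalation.
        if elapsed < self.first_notification_time:
            return False
        # Assumption: the upper bound is exclusive, so at t = 120 only
        # ToLevel3 applies.
        if self.last_notification_time != 0 and elapsed >= self.last_notification_time:
            return False
        return True


ESCALATIONS = [
    Escalation("ToLevel2", 60, 120, "level2"),
    Escalation("ToLevel3", 120, 0, "level3"),
]


def contacts_at(elapsed: int, default: str = "level1") -> list:
    """Return the contacts to notify after `elapsed` minutes of problem."""
    active = [e.contacts for e in ESCALATIONS if e.is_active(elapsed)]
    # No escalation matches: fall back to the standard contact.
    return active or [default]


for t in (0, 60, 119, 120, 5000):
    print(t, contacts_at(t))
```

With these assumptions, t = 0 gives level1, t = 60 and t = 119 give level2,
and t = 120 or later gives level3.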


*How it mixes with the notification_interval:*
OK, you think "it's easy, you just set the times as multiples of the
notification interval". No :)
That's where it's not so easy to code, in fact ;)

In this example, you can see the notification_interval is 1440 (one day). If
we kept the old way (notifications raised only every notification interval),
we would get:
* t = 0 -> standard contact, here level1
* t = 1440 -> escalation to level3
Where is level2? Not good.

So with time-based escalation, the delay before the next notification is the
minimum of the time until the next escalation and the standard notification
interval. So in fact we will have:
* t = 0 -> standard contact, here level1
* t = 60 -> escalation to level2
* t = 120 -> escalation to level3
* t = 120 + 1440 -> still level3 (once a day is a good thing for
notifications)

:D
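
The timeline above can be reproduced with a small Python sketch. It is a
simplified model, not the scheduler's real code: the notification_times
function and its parameters are hypothetical names, and times are in minutes.

```python
def notification_times(escalation_starts, notification_interval, horizon):
    """Return the notification times (in minutes) up to `horizon`.

    escalation_starts: first_notification_time of each escalation.
    notification_interval: the standard interval between notifications.
    """
    times = []
    t = 0
    while t <= horizon:
        times.append(t)
        # Delays until each escalation that has not started yet.
        upcoming = [start - t for start in escalation_starts if start > t]
        # Next delay: the min of the next escalation time and the
        # standard notification interval.
        t += min(upcoming + [notification_interval])
    return times


# Old behaviour, driven by the interval alone: level2 is skipped.
print(notification_times([], 1440, 3000))         # [0, 1440, 2880]
# Time-based escalation with ToLevel2 (60) and ToLevel3 (120).
print(notification_times([60, 120], 1440, 3000))  # [0, 60, 120, 1560, 3000]
```

Once no escalation is left to reach, the min falls back to the plain
notification interval, which gives the 120 + 1440 step.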

So now, defining a standard SLA is not a pain in the ass anymore! And linking
it to hosts/services is also very easy, because you just need to reference the
escalations in your templates.


*And now?*
We are getting closer to 0.5. We already have 3 of our major features:
* Criticity
* Business rules
* Escalation based on time

The next major one still missing is downtime for contacts :)

We also need to code views for tools like Thruk, because things like
problem/impact/criticity and business rules are good in a scheduler, but
they are even more useful with a real UI view :)
The problem/impact/criticity support in Thruk, for example, is already quite
advanced (look at https://github.com/sni/Thruk/commits/shinken to try it) :)
I'll send another mail with info about it in the next few days.

I'm currently working on the business rules for Thruk. When that's done, I'll
update the doc (if someone wants to update it with what I said here, feel
free ;) ), make screenshots for the web site, and we will be very, very close
to 0.5 :D


Jean
