Toussaint OTTAVI wrote:

Following this idea, I will investigate the following:
- Hosts are associated through parent/child relationships according to the WAN topology (already working)
- For each host, I will create a "parent" service with only a check_alive command
- Every other service will be a child of this parent service

Answering myself... After some investigation and reading the docs :-) it seems I confused "parent/child" with "dependency":

- The parent/child relationship is for hosts only, and should map the network topology. When a host is DOWN, all its children are set to UNREACHABLE. But this parent/child relationship does not exist for services.

- A dependency can be defined for either hosts or services. When the object that is depended upon (the master) is in a failure state, the dependent object is not checked. But no assumption is made about the dependent object's status: it is not set to UNREACHABLE or UNKNOWN, as it would be with a parent/child relationship.
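
To make the distinction concrete, here is a minimal sketch of both mechanisms for hosts. The host names come from the example further down; the `generic-host` template and the 192.0.2.x addresses are placeholders assumed for illustration, not part of the original configuration.

```cfg
# Parent/child: describes topology. If REMOTE_HOST1 stops responding
# while Remote_WAN_Router is DOWN, it is reported as UNREACHABLE
# rather than DOWN.
define host{
   use                           generic-host        ; hypothetical template supplying the required directives
   host_name                     Remote_WAN_Router
   address                       192.0.2.1           ; placeholder address
}

define host{
   use                           generic-host        ; hypothetical template
   host_name                     REMOTE_HOST1
   address                       192.0.2.10          ; placeholder address
   parents                       Remote_WAN_Router
}

# Dependency: suppresses active checks of the dependent host while the
# master is DOWN or UNREACHABLE. The dependent host's status is left as is.
define hostdependency{
   host_name                     Remote_WAN_Router
   dependent_host_name           REMOTE_HOST1
   execution_failure_criteria    d,u
}
```

The two definitions are independent of each other: `parents` changes how a failed host is classified, while the dependency only decides whether a check is run at all.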


Here's the current situation:

- Creating a dependency solves my problem of not checking services when hosts are unreachable due to a WAN failure. This is a smarter solution than my previous attempt using event handlers and the DISABLE_ALL_SVC_CHECKS external command. Using wildcards, I just have to declare one dependency for all services on several hosts, like this:

 define servicedependency{
   host_name                          Remote_WAN_Router
   service_description                Remote WAN router ping test
   dependent_host_name                REMOTE_HOST1, REMOTE_HOST2, ..., REMOTE_HOSTn
   dependent_service_description      *
   inherits_parent                    1
   execution_failure_criteria         w,u,c
 }

- With that in place, when the WAN fails, the checks are not executed and the services keep their previous status. That's a good thing, but I would have preferred that they get the status UNKNOWN or UNREACHABLE. In fact, I would like services to have the same parent/child behavior that exists for hosts.

- I'm not sure it will solve the "latency" problem: if a service check on remote_host runs before remote_wan_router is declared DOWN and the dependency takes effect, I'll still get critical failures for those services. The console will then display a mix of FAILED services (those checked before the WAN router check) and OK services (the previous state of services that were not checked because of the dependency). This display would be completely wrong!

Again, in such a situation, I think the right display for services whose status could not be determined would be "UNKNOWN", just as hosts are shown as "UNREACHABLE".
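
On the latency concern, one setting that may narrow the window, assuming Nagios 3.x: the main configuration file has options for predictive dependency checks, which ask Nagios to run a check of the master service when a dependent service changes state, so that the dependency logic works from fresher data. A sketch for nagios.cfg follows; the horizon value is illustrative and should be checked against the documentation for the version in use.

```cfg
# nagios.cfg (assumes Nagios 3.x)
# Check the master service as soon as a dependent service changes state.
enable_predictive_service_dependency_checks=1

# Reuse a master service result if it is at most this many seconds old
# (illustrative value).
cached_service_check_horizon=15
```

This would most likely not prevent the first failed check of a dependent service from being recorded, so it reduces the mixed display rather than removing it, and it does not change the status shown for services that are simply not checked.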

Comments and ideas welcome.

Kind regards,
--

*Toussaint OTTAVI*
*MEDI INFORMATIQUE*
*Mail:* [EMAIL PROTECTED]

_______________________________________________
Nagios-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nagios-users