> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of David Miller
> Sent: Tuesday, January 09, 2007 9:09 AM
> To: Hendrik Bäcker
> Cc: [email protected]
> Subject: Re: [Nagios-users] parent/child setup not working (Resolved)
>
> Hendrik Bäcker wrote:
>
> I finally got this figured out, and thought I'd send a summary for the
> archives. Hopefully it will help others with the search engines and
> archives.
Thanks!
> > Don't think of the checking order like this:
> >
> > Parent Host --> Host --> Service
> >
> > It is more like this:
> >
> > Service --> Host --> Parent
> >
> > The intelligence in Nagios is that it suppresses notifications, not the
> > service checks.
> >
>
> This is pretty much the key. Perhaps the documentation could be
> clearer; it's possible to read the documentation the right way, but a
> lot of people thought my configs looked right, so it's easy to read it
> the wrong way as well.
>
Feel free to submit suggested changes =) Logically, this part of the host{}
definition, combined with the "Determining Status And Reachability of Network
Hosts" section of the documentation, is sufficient for me --
"check_command: This directive is used to specify the short name of the
command that should be used to check if the host is up or down. Typically, this
command would try and ping the host to see if it is "alive". The command must
return a status of OK (0) or Nagios will assume the host is down. ***If you
leave this argument blank, the host will not be checked - Nagios will always
assume the host is up.*** This is useful if you are monitoring printers or
other devices that are frequently turned off. The maximum amount of time that
the notification command can run is controlled by the host_check_timeout
option."
Emphasis mine. If the host status is UP, then it's not possible for its parent
to be down, so no parent check is performed.
>
> The solution requires a host check be specified for every host with a
> service check. If the service check fails, nagios will perform said host
> check and determine the host is unreachable. If a parent host (pix) is
> defined for the unreachable host (web), notifications will be
> suppressed. So if you don't want notifications about hosts beyond a
Not quite right. The check of the host that the service is running on only
determines whether that host is up or down. If it is down and its parent is
reachable, the host itself is assumed to be down; otherwise the host is
unreachable. That process works its way up the tree to the top parent.
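
A minimal sketch of that walk in Python, to make the order concrete. This is an
illustrative model, not Nagios's actual code. The host names pix and web come
from this thread; "switch", the parents mapping, and the ping_ok results are
made up for the example, and it assumes notifications on UNREACHABLE are not
enabled.

```python
# Illustrative model only; not Nagios source code.
# parents maps each host to its parent (None = directly attached to Nagios).
# "switch" is a hypothetical host added for the example.
parents = {"web": "pix", "pix": "switch", "switch": None}


def host_state(host, ping_ok, parents):
    """Classify a host as UP, DOWN, or UNREACHABLE.

    ping_ok maps host name -> result of that host's own check (assumed data).
    A host missing from ping_ok models a blank check_command: it is never
    checked and is always assumed UP.
    """
    if ping_ok.get(host, True):
        return "UP"  # host is up, so its parents are never consulted
    parent = parents[host]
    if parent is None:
        return "DOWN"  # top of the tree, nothing above it to blame
    # The host failed its own check: now look at the parent.
    if host_state(parent, ping_ok, parents) == "UP":
        return "DOWN"  # parent is reachable, so the host itself is down
    return "UNREACHABLE"  # parent is down or unreachable as well


def should_notify(host, ping_ok, parents):
    # Assumption: only DOWN hosts notify; UNREACHABLE ones are suppressed.
    # The checks themselves still run either way.
    return host_state(host, ping_ok, parents) == "DOWN"
```

With pix failing its check and web behind it, pix comes out DOWN and web
UNREACHABLE, so only pix would notify. Leave web out of ping_ok (no host
check) and it stays UP no matter what pix does, which is the original problem
in this thread.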
> As an aside, the documentation alludes to performance issues from host
> check, most of which are based on pings. Is this due to the nastiness
> of handling icmp packets on their return? I.e., when an icmp packet is
> received the kernel hands a copy of it to all processes listening for a
> reply; this gets ugly quickly if too many processes are pinging at
> once. If that's the case I have some code I'd be happy to donate that
> solves that particular problem.
No, the problem is that in the current version of nagios, performing a host
check stops _all_ other activity until the check has been performed up to
max_check_attempts times. No other service checks, host checks, notifications,
anything. If you have a host check that pings 10 times (about a second per
ping) and a max_check_attempts of 3, nagios will stop everything for 30 seconds
or longer. If you have 10 hosts down, you've now stopped everything else for
5 minutes. If you're checking parents for those hosts, add more time. The
problem compounds quickly. That's why you want host checks to be simple and
quick.
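
The arithmetic behind that estimate, as a rough sketch in Python. The function
name and the one-second-per-ping figure are assumptions for illustration; real
timings depend on the plugin and its timeout settings.

```python
def blocked_seconds(pings_per_check, max_check_attempts, hosts_down,
                    seconds_per_ping=1.0):
    """Rough time the scheduler is stalled by serial host checks.

    Assumes each unanswered ping costs seconds_per_ping and that down
    hosts are checked one after another with nothing else running.
    """
    per_host = pings_per_check * seconds_per_ping * max_check_attempts
    return per_host * hosts_down


one_host = blocked_seconds(10, 3, 1)     # 30.0 seconds
ten_hosts = blocked_seconds(10, 3, 10)   # 300.0 seconds
```

Dropping the check to a single ping with the same three attempts brings ten
down hosts to about 30 seconds in this model.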
--
Marc