Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-09 Thread Carl Friend
   Patrick Morris writes:

HS> ?? How. Or I'm stupid? How should the host respond to service checks
HS> if it's down and doesn't respond to ping therefore?
>
> It happens.  For example, I have two switches in a strange state
> that I can't ping, but otherwise work fine.

   Technically speaking, all a "ping" tells one these days is that
the path to the host is operational, not necessarily that the host
itself is alive.  I've seen plenty of hosts that have completely
and totally wedged up happily respond to an ICMP ping because the
responsibility for that has been devolved to the NIC.  So, what one
winds up with is a host that's reported as "UP", when in fact all the
services are deader than doornails.

  The annoying thing about the way Nagios handles this is that if a
ping doesn't work (and lots of sites block ICMP echo requests as part
of a "security through obscurity" scheme), one winds up with a "DOWN"
host and lots of "UP" services.  But, as has been alluded to, getting
the logic right can be problematic.  There may be no "right" answer.

+-++
| Carl Richard Friend (UNIX Sysadmin) | Natick, Massachusetts, USA |
| Minicomputer Collector / Enthusiast | 01760-2098 |
| mailto:[EMAIL PROTECTED]++
| http://users.rcn.com/crfriend/museum| ICBM: +42:18:00  -71:21:03 |
+-++

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-09 Thread Heiko Schlittermann
Greg King <[EMAIL PROTECTED]> (Mi 09 Apr 2008 08:13:51 CEST):
> > > Heiko Schlittermann
>  
> The normal check_host_alive command is ping, but it might not work for some 
> hosts like firewalls, etc.  For these hosts use NMAP to scan for an open TCP 
> port on the host (ssh or http ports are frequently open), then create a 
> check_host_alive2 that does a check_tcp to that known open port, and override 
> the default check_host_alive for the hosts in question, or create a new group 
> for these hosts and use the new check_host_alive2 command. 

Yep. Thank you. I know about this possibility; I was just confused by
the logic behind the host checks. But thanks to all the answers, I now
understand the reason for triggering a host check as soon as *any*
service fails. (That doesn't mean that I agree with this logic, but I
understand it ;-))

Best regards from Dresden
Viele Grüße aus Dresden
Heiko Schlittermann
-- 
 SCHLITTERMANN.de  internet & unix support -
 Heiko Schlittermann HS12-RIPE -
 gnupg encrypted messages are welcome - key ID: 48D0359B ---
 gnupg fingerprint: 3061 CFBF 2D88 F034 E8D2  7E92 EE4E AC98 48D0 359B -



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-09 Thread Matthew Macdonald-Wallace
On Tue, 8 Apr 2008 12:50:01 +0200
Heiko Schlittermann <[EMAIL PROTECTED]> wrote:

> I've a list of hosts, these hosts are not available for ping, but
> normal service checks (SSH, SMTP, ...) work. Nagios reports theses
> hosts beeing down! Ugly!
> 
> If I remember well, older nagios versions "knew" that's enough to see
> one service on a host to know this host has to be up.

Not sure if it's available in 3.x (we're still happily using 2.x here!),
but we use ssh as our check-host-alive command for the reasons that
you've listed above (we block most if not all ICMP traffic to our hosts
unless it's specifically required). A connection is initialised via ssh
and, if that works, the host is deemed to be up.

We also use NRPE to check that sshd is running, so if the host appears
as down but all the services apart from sshd are up, we know that it
is most likely an issue with sshd, not the server.

Hope this helps,

Matt
-- 
|Matthew Macdonald-Wallace
|Tiger Computing Ltd
|"The Linux Specialists"
|
|Tel: 0330 088 1511
|Web: http://www.tiger-computing.co.uk
|
|Registered in England. Company number: 3389961
|Registered address: Wyastone Business Park,
| Wyastone Leys, Monmouth, NP25 3SR



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Greg King
> Message: 8
> Date: Tue, 8 Apr 2008 09:32:25 -0800
> From: Israel Brewster 
> Subject: Re: [Nagios-users] Too stupid? Services are available, but
> nagios reports host to be down!
> To: Heiko Schlittermann 
> Cc: nagios-users 
> Message-ID: 
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes
> 
> On Apr 8, 2008, at 2:50 AM, Heiko Schlittermann wrote:
> > Hello,
> >
> > (using 3.0.1)
> >
> > I've a list of hosts, these hosts are not available for ping, but 
> > normal
> > service checks (SSH, SMTP, ...) work. Nagios reports theses hosts 
> > beeing
> > down! Ugly!
> >
> > If I remember well, older nagios versions "knew" that's enough to see
> > one service on a host to know this host has to be up.
> 
> To a degree, yes- if you aren't actively checking the host (as would 
> appear to be the case from your next paragraph), then as long as all 
> services on the host are listed as ok, nagios assumes the host is 
> still ok (at least once running, I don't know how it behaves on the 
> initial check). However, should any of the services go into a non-ok 
> state, nagios will immediately check the host (using the host 
> check_command), wherupon, in your case, it would determine the host to 
> be down since it can't ping. The state of the other services does not 
> affect this process, so any other services do not change state.
> 
> > The host check_command is the normale 'check-host-alive' (which is
> > pinging), the check_interval is 0 -- why does nagios want to check 
> > that
> > host?
> 
> Because at some point one or more of the services went into a non-ok 
> state.
> 
> > The check_command is inherited from some template, if I try to 
> > override
> > it with no value, nagios complains:
> >
> > Error: Host check command '(null)' specified for host 'diwi/diw' is 
> > not defined anywhere
> 
> Yep- you can't have no value in the check_command directive. If you 
> just want to assume the host is up all the time, you can use the 
> check_dummy plugin (after defining a check_dummy command in your 
> checkcommands.cfg, naturally). Otherwise you'll need to figure out 
> some check Nagios can perform to determine if the host is running, 
> even if that check is just checking one of the services again or 
> something.
> 
> ---
> Israel Brewster
> Computer Support Technician
> Frontier Flying Service Inc.
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7250 x293
> ---
> >
> >
> >
> > So - please, could anybody point to my stupidity?
> >
> > Thanks.
> >
> >
> > Best regards from Dresden
> > Viele Grüße aus Dresden
> > Heiko Schlittermann
 
The normal check_host_alive command is ping, but it might not work for some 
hosts like firewalls, etc.  For these hosts use NMAP to scan for an open TCP 
port on the host (ssh or http ports are frequently open), then create a 
check_host_alive2 that does a check_tcp to that known open port, and override 
the default check_host_alive for the hosts in question, or create a new group 
for these hosts and use the new check_host_alive2 command. 
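
As a rough illustration of what such a TCP-based host check does, here is a minimal Python sketch. It is not the check_tcp plugin itself; the script name check_host_alive2.py and the function tcp_host_state are made up for this example, and it assumes the usual plugin convention that exit code 0 means OK (host UP) and 2 means CRITICAL (host DOWN).

```python
#!/usr/bin/env python3
"""Host-alive check that tries a TCP connection instead of an ICMP ping."""
import socket
import sys

# Nagios plugin exit codes: 0 = OK (host UP), 2 = CRITICAL (host DOWN).
STATE_OK = 0
STATE_CRITICAL = 2


def tcp_host_state(host, port, timeout=5.0):
    """Return (exit_code, message) after one TCP connection attempt."""
    try:
        # A completed handshake proves that something on the host answers.
        with socket.create_connection((host, port), timeout=timeout):
            return STATE_OK, f"TCP OK - {host}:{port} accepts connections"
    except OSError as exc:
        # Refused, unreachable and timed-out connections all end up here.
        return STATE_CRITICAL, f"TCP CRITICAL - {host}:{port} unreachable ({exc})"


# Hypothetical command line: check_host_alive2.py HOST PORT
if __name__ == "__main__" and len(sys.argv) == 3:
    code, message = tcp_host_state(sys.argv[1], int(sys.argv[2]))
    print(message)
    sys.exit(code)
```

The port should be one that is known to stay open on the host (ssh or http, as suggested above); otherwise the host check fails for reasons unrelated to the host itself.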
 
Regards,
Greg King
www.wgk-consulting.com


  



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Steve Shipway
> I've a list of hosts, these hosts are not available for ping, but normal
> service checks (SSH, SMTP, ...) work. Nagios reports these hosts being
> down! Ugly!

On our system, we too have a small subset of hosts which cannot be
pinged.  However, they can be reached over SSH.  So, I defined a new
test, check-host-alive-ssh, which uses an SSH connection rather than a
ping, and set it as the host check_command for the hosts in question.
This allows Nagios to continue to work as expected.

Steve



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Patrick Morris
Heiko Schlittermann wrote on Tuesday, 8 April 2008:

> Folkert van Heusden <[EMAIL PROTECTED]> (Di 08 Apr 2008 23:16:55 CEST):
> > > | But - my question here, why is *any* failing service a trigger of a host
> > > | check? Shouldn't be the failure of *all* services this trigger?
> >
> > Sorry to drop-in (and maybe saying something stupid) but as far as i
> > know the host may be down if it stopped responding to ping but still
> > responds to service-checks.
> 
> ?? How. Or I'm stupid? How should the host respond to service checks
> if it's down and doesn't respond to ping therefore?

It happens.  For example, I have two switches in a strange state that I
can't ping, but otherwise work fine.

That's beside the point, though... Under normal circumstances, Nagios
will assume that, if your host check fails, all your services should be
considered broken, too.  That's why a host check should always be
designed so that it fails always, and only, if a host should be
considered down.



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Heiko Schlittermann
Folkert van Heusden <[EMAIL PROTECTED]> (Di 08 Apr 2008 23:16:55 CEST):
> > | But - my question here, why is *any* failing service a trigger of a host
> > | check? Shouldn't be the failure of *all* services this trigger?
> 
> Sorry to drop-in (and maybe saying something stupid) but as far as i
> know the host may be down if it stopped responding to ping but still
> responds to service-checks.

?? How? Or am I being stupid? How could the host respond to service checks
if it's down and therefore doesn't respond to ping?

-- 
Heiko



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Folkert van Heusden
> | But - my question here, why is *any* failing service a trigger of a host
> | check? Shouldn't be the failure of *all* services this trigger?

Sorry to drop in (and maybe say something stupid), but as far as I
know, a host may be considered down if it has stopped responding to
ping, even though it still responds to service checks.


Folkert van Heusden

-- 

Multitail - a flexible utility for following log files and command
output. Filtering, coloring, merging, visual comparison, etc.
http://www.vanheusden.com/multitail/
--
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Hugo van der Kooij
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Heiko Schlittermann wrote:

| But - my question here, why is *any* failing service a trigger of a host
| check? Shouldn't be the failure of *all* services this trigger?

Well. If a service goes missing it might very well be an issue with the
host or even further up the line. So the first thing to do is see if the
host is still there. If it is, the other service checks will go on. But
if the host is down, there is little point in sending out alerts for
each missing service until you have them all, only to report a down
host in the end.

By doing an immediate host check you only have to send a host down
notification and be done with it if the host is down.
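
The reasoning above can be put into a small Python model. This is only a simplified sketch of the behaviour described in this thread, not Nagios's actual scheduling code; the function name notifications_for and the message strings are invented for the example.

```python
def notifications_for(service_results, host_is_up):
    """Simplified model: decide what to notify after one round of checks.

    service_results maps a service name to True (OK) or False (failed).
    host_is_up is a callable that runs the host check on demand.
    """
    failed = [name for name, ok in service_results.items() if not ok]
    if not failed:
        # Nothing failed, so no on-demand host check is needed.
        return []
    if not host_is_up():
        # The first failing service triggers the host check; one host
        # alert then replaces the individual service alerts.
        return ["HOST DOWN"]
    return [f"SERVICE {name} CRITICAL" for name in failed]
```

With a host check that can never succeed (a ping against a host that drops ICMP), a single failing service is enough for this model to report "HOST DOWN", even while the remaining services are fine. That is the situation the original poster ran into.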

Hugo.

- --
[EMAIL PROTECTED]   http://hugo.vanderkooij.org/
PGP/GPG? Use: http://hugo.vanderkooij.org/0x58F19981.asc

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?

Bored? Click on http://spamornot.org/ and rate those images.

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFH++AIBvzDRVjxmYERAoSMAJ9aAfJzGwQ67xSUdtWS4NSolaqNWgCggmT/
dHTaLDuiRElOpkugiF0t0bY=
=bHTn
-END PGP SIGNATURE-



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Marc Powell


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Heiko Schlittermann
> Sent: Tuesday, April 08, 2008 2:44 PM
> To: Israel Brewster
> 

> But - my question here, why is *any* failing service a trigger of a host
> check? Shouldn't be the failure of *all* services this trigger?

Service A gets checked every 5 minutes.
Service B gets checked once a day.

Do you _really_ want to wait to know if the host is down until Service B
is checked and fails?

The problem you're experiencing is an artifact of your configuration
methodology. If you don't want hosts checked, ever, do not specify a
host check_command (i.e. leave the entire line out). The fact that you
have it included in a template applied to the host is why it's being
checked. Use a template that doesn't specify a host check_command.

--
Marc



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Patrick Morris
Hi Heiko!

Heiko Schlittermann wrote on Tuesday, 8 April 2008:

> But - my question here, why is *any* failing service a trigger of a host
> check? Shouldn't be the failure of *all* services this trigger?

This is so that if the outage of a service is caused by the host being
down, you are notified that the host is down, and not the service.

If Nagios waited for every service to fail first, you'd get a lot of
alerts you probably don't want.



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Heiko Schlittermann
Hello Israel,

Israel Brewster <[EMAIL PROTECTED]> (Di 08 Apr 2008 19:32:25 CEST):
> On Apr 8, 2008, at 2:50 AM, Heiko Schlittermann wrote:
> >Hello,
> >
> >(using 3.0.1)
> >
> >I've a list of hosts, these hosts are not available for ping, but  
> >normal
> >service checks (SSH, SMTP, ...) work. Nagios reports theses hosts  
> >beeing
> >down! Ugly!
> >
> >If I remember well, older nagios versions "knew" that's enough to see
> >one service on a host to know this host has to be up.
> 
> To a degree, yes- if you aren't actively checking the host (as would  
> appear to be the case from your next paragraph), then as long as all  
> services  on the host are listed as ok, nagios assumes the host is  
> still ok (at least once running, I don't know how it behaves on the  
> initial check). However, should any of the services go into a non-ok  
> state, nagios will immediately check the host (using the host  
> check_command), wherupon, in your case, it would determine the host to  
> be down since it can't ping. The state of the other services does not  
> affect this process, so any other services do not change state.

That's an interesting detail: if ANY of the service checks fails, a host
check is scheduled.

This would explain why the host check takes place and fails (if it's
using "ping").

But - my question here: why is *any* failing service a trigger for a host
check? Shouldn't the failure of *all* services be the trigger?

-- 
Heiko



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Israel Brewster
On Apr 8, 2008, at 2:50 AM, Heiko Schlittermann wrote:
> Hello,
>
> (using 3.0.1)
>
> I've a list of hosts, these hosts are not available for ping, but  
> normal
> service checks (SSH, SMTP, ...) work. Nagios reports theses hosts  
> beeing
> down! Ugly!
>
> If I remember well, older nagios versions "knew" that's enough to see
> one service on a host to know this host has to be up.

To a degree, yes. If you aren't actively checking the host (as would
appear to be the case from your next paragraph), then as long as all
services on the host are listed as OK, Nagios assumes the host is
still OK (at least once running; I don't know how it behaves on the
initial check). However, should any of the services go into a non-OK
state, Nagios will immediately check the host (using the host
check_command), whereupon, in your case, it would determine the host to
be down since it can't ping it. The state of the other services does not
affect this process, and the other services do not change state.

> The host check_command is the normale 'check-host-alive' (which is
> pinging), the check_interval is 0 -- why does nagios want to check  
> that
> host?

Because at some point one or more of the services went into a non-ok  
state.

> The check_command is inherited from some template, if I try to  
> override
> it with no value, nagios complains:
>
> Error: Host check command '(null)' specified for host 'diwi/diw' is  
> not defined anywhere

Yep, you can't leave the check_command directive empty. If you
just want to assume the host is up all the time, you can use the
check_dummy plugin (after defining a check_dummy command in your
checkcommands.cfg, naturally). Otherwise you'll need to figure out
some check Nagios can perform to determine if the host is running,
even if that check just tests one of the services again.
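
For illustration, the behaviour of such a dummy check can be sketched in Python. This is not the check_dummy plugin's source, just a minimal stand-in that assumes the usual plugin convention of exit codes 0 to 3; dummy_check is a hypothetical name.

```python
# Conventional plugin states: the exit code is what Nagios evaluates.
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def dummy_check(state, text=""):
    """Return (exit_code, output_line) for a fixed, caller-chosen state."""
    if state not in STATE_NAMES:
        # Anything outside 0..3 is reported as UNKNOWN.
        return 3, f"UNKNOWN: status {state} is not a supported state"
    label = STATE_NAMES[state]
    return state, f"{label}: {text}" if text else label
```

Used as a host check with state 0, this always reports the host as up, so it also hides real outages.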

---
Israel Brewster
Computer Support Technician
Frontier Flying Service Inc.
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x293
---
>
>
>
> So - please, could anybody point to my stupidity?
>
> Thanks.
>
>
>Best regards from Dresden
>Viele Grüße aus Dresden
>Heiko Schlittermann
> -- 
> SCHLITTERMANN.de  internet & unix  
> support -
> Heiko Schlittermann HS12-RIPE  
> -
> gnupg encrypted messages are welcome - key ID: 48D0359B  
> ---
> gnupg fingerprint: 3061 CFBF 2D88 F034 E8D2  7E92 EE4E AC98 48D0  
> 359B -


Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Marc Powell


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Heiko Schlittermann


> > That one would be non-trivial, but perhaps the check command could read
> > current status for the host and only do test if there are no services
> > currently in OK state? Otherwise it would return the host is UP.
> >
> 
> That's it. Is there any easy to access interface where I can read the
> status of service checks of the host in question?

Using the check_cluster(2) plugin for your host check perhaps? I don't
use the plugin myself but the concepts are the same...

--
Marc



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Heiko Schlittermann
Wojciech Kocjan <[EMAIL PROTECTED]> (Di 08 Apr 2008 14:34:38 CEST):
> On 08-04-2008 at 14:21:08, Heiko Schlittermann <[EMAIL PROTECTED]>
> wrote:
> 
> That one would be non-trivial, but perhaps the check command could read  
> current status for the host and only do test if there are no services  
> currently in OK state? Otherwise it would return the host is UP.
> 

That's it. Is there an easy-to-access interface where I can read the
status of the service checks for the host in question?


Best regards from Dresden
Viele Grüße aus Dresden
Heiko Schlittermann
-- 
 SCHLITTERMANN.de  internet & unix support -
 Heiko Schlittermann HS12-RIPE -
 gnupg encrypted messages are welcome - key ID: 48D0359B ---
 gnupg fingerprint: 3061 CFBF 2D88 F034 E8D2  7E92 EE4E AC98 48D0 359B -



Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Wojciech Kocjan
On 08-04-2008 at 14:21:08, Heiko Schlittermann <[EMAIL PROTECTED]>
wrote:

> Hello,
>
> Valdinger, Stephen (DOV, MSX) <[EMAIL PROTECTED]> (Di 08 Apr 2008  
> 13:58:36 CEST):
>> Just rewrite the check-host-alive plugin for the unpingable hosts to  
>> something bogus like check_null for some bogus output. Or use a  
>> different check command other than a ping for the check-host alive.
>
> Thank you for your response.
>
> But - a bogus check returning "OK" would be wrong too, because the host
> *can* be down.
>
> The host should assumed to be UP if *any* service check on this host was
> successful. The host should assumed to be DOWN, if all service checks
> failed.
>
> Any suggestion anybody?

That one would be non-trivial, but perhaps the check command could read
the current status for the host and only run the test if no services
are currently in an OK state? Otherwise it would report the host as UP.
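
A minimal Python sketch of that idea, under some assumptions: it presumes the status file consists of "servicestatus { ... }" blocks with key=value lines (host_name, service_description, current_state), which is how I understand the Nagios 3 status file to look, and that state 0 means OK. The function names are made up for this example, and the location of the status file is left to the caller.

```python
import re


def service_states(status_text, host):
    """Return {service_description: current_state} for one host."""
    states = {}
    # Each block runs from "servicestatus {" to a line holding only "}".
    for block in re.findall(r"servicestatus\s*\{(.*?)\n\s*\}", status_text, re.S):
        fields = dict(
            line.strip().split("=", 1)
            for line in block.splitlines()
            if "=" in line
        )
        if fields.get("host_name") == host:
            name = fields.get("service_description", "?")
            # A missing state is treated as 3 (UNKNOWN).
            states[name] = int(fields.get("current_state", 3))
    return states


def any_service_ok(status_text, host):
    """True if at least one service of the host is currently OK (state 0)."""
    return any(state == 0 for state in service_states(status_text, host).values())
```

A host check built on this would report UP whenever any_service_ok() is true and fall back to a real test (ping, TCP connect) only when it is false. The states it reads are only as fresh as the last service checks.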

-- 
Wojciech Kocjan


Re: [Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Heiko Schlittermann
Hello,

Valdinger, Stephen (DOV, MSX) <[EMAIL PROTECTED]> (Di 08 Apr 2008 13:58:36 
CEST):
> Just rewrite the check-host-alive plugin for the unpingable hosts to 
> something bogus like check_null for some bogus output. Or use a different 
> check command other than a ping for the check-host alive.

Thank you for your response.

But - a bogus check returning "OK" would be wrong too, because the host
*can* be down.

The host should be assumed to be UP if *any* service check on this host
was successful. The host should be assumed to be DOWN if all service
checks failed.

Any suggestion anybody?

Best regards from Dresden
Viele Grüße aus Dresden
Heiko Schlittermann
-- 
 SCHLITTERMANN.de  internet & unix support -
 Heiko Schlittermann HS12-RIPE -
 gnupg encrypted messages are welcome - key ID: 48D0359B ---
 gnupg fingerprint: 3061 CFBF 2D88 F034 E8D2  7E92 EE4E AC98 48D0 359B -



[Nagios-users] Too stupid? Services are available, but nagios reports host to be down!

2008-04-08 Thread Heiko Schlittermann
Hello,

(using 3.0.1)

I've a list of hosts; these hosts are not available for ping, but normal
service checks (SSH, SMTP, ...) work. Nagios reports these hosts as being
down! Ugly!

If I remember correctly, older Nagios versions "knew" that seeing one
working service on a host is enough to know that the host is up.

The host check_command is the normal 'check-host-alive' (which pings),
and the check_interval is 0 -- so why does Nagios want to check that
host?

The check_command is inherited from some template, if I try to override
it with no value, nagios complains: 

Error: Host check command '(null)' specified for host 'diwi/diw' is not defined 
anywhere


So - please, could anybody point to my stupidity?

Thanks.


Best regards from Dresden
Viele Grüße aus Dresden
Heiko Schlittermann
-- 
 SCHLITTERMANN.de  internet & unix support -
 Heiko Schlittermann HS12-RIPE -
 gnupg encrypted messages are welcome - key ID: 48D0359B ---
 gnupg fingerprint: 3061 CFBF 2D88 F034 E8D2  7E92 EE4E AC98 48D0 359B -

