Well, deleting and adding the host stopped it for a while, but now overnight 
the duplicate (ping + host) notifications are back.

Is there a proper way to debug what is generating notifications, some way to 
see into the logic it used? 
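
One way to see what was actually sent is to read the notification entries
the Nagios core writes to its log. The sketch below is only a starting
point. It assumes the classic Nagios log layout, "[epoch] HOST NOTIFICATION:
contact;host;state;command;output" and "[epoch] SERVICE NOTIFICATION:
contact;host;service;state;command;output", and that the log of an OMD site
is var/nagios/nagios.log (check the path on your install). The function
names, the 120 second window and the sample lines are made up for
illustration.

```python
import re

# Classic Nagios log entries for notifications look like:
#   [epoch] HOST NOTIFICATION: contact;host;state;command;output
#   [epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output
LINE_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) NOTIFICATION: (.*)$")


def parse_notifications(lines):
    """Return one dict per notification entry found in the given log lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # alerts, check results and so on are ignored
        fields = match.group(3).split(";")
        if match.group(2) == "HOST":
            if len(fields) < 4:
                continue
            contact, host, state, command = fields[:4]
            service = None
        else:
            if len(fields) < 5:
                continue
            contact, host, service, state, command = fields[:5]
        events.append({
            "time": int(match.group(1)),
            "contact": contact,
            "host": host,
            "service": service,  # None for host notifications
            "state": state,      # kept verbatim from the log
            "command": command,
        })
    return events


def paired_notifications(events, window=120):
    """Pair host and service notifications for the same host and contact
    that were logged within `window` seconds of each other."""
    host_events = [e for e in events if e["service"] is None]
    service_events = [e for e in events if e["service"] is not None]
    pairs = []
    for h in host_events:
        for s in service_events:
            if (s["host"] == h["host"]
                    and s["contact"] == h["contact"]
                    and abs(s["time"] - h["time"]) <= window):
                pairs.append((h, s))
    return pairs


# Hypothetical sample lines, not taken from a real log.
SAMPLE = [
    "[1389830000] SERVICE NOTIFICATION: admin;printer1;PING;OK;check-mk-notify;PING OK",
    "[1389830005] HOST NOTIFICATION: admin;printer1;UP;check-mk-notify;PING OK",
    "[1389830100] HOST NOTIFICATION: admin;switch1;UP;check-mk-notify;PING OK",
    "[1389830200] SERVICE ALERT: printer1;PING;OK;HARD;1;PING OK",
]

if __name__ == "__main__":
    for host_ev, svc_ev in paired_notifications(parse_notifications(SAMPLE)):
        print(host_ev["host"], svc_ev["service"], host_ev["time"], svc_ev["time"])
```

To run it against a real log, pass open("var/nagios/nagios.log") instead of
SAMPLE. Hosts that show up in the pairs are the ones getting both a host and
a PING service notification.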


-----Original Message-----
From: o...@leferguson.com 
Sent: Wednesday, January 15, 2014 7:12 PM
To: 'omd-users@lists.mathias-kettner.de'
Subject: RE: [omd-users] Duplicate messages for ping-only


Well, I found out something -- if I delete a host, and use my script to 
re-discover it and load it in hosts.mk, it stops giving duplicate notifications.

So something went wrong, I assume in my first attempts at this, but I still 
can't figure out how to tell what happened (I do have some more hosts giving 
duplicate messages if anyone has ideas).

But I can (I think) get rid of the duplicates by starting over and recreating 
them.  

I was (I think) a bit disappointed when I deleted and re-added a host to find 
it still had its history.  That would appear to mean history is keyed by name, 
not by some internal ID, and is retained when you delete with WATO.

Is there some garbage collection or cleanup that occurs after the fact that I 
just didn't wait for?  Or do deleted hosts keep data around forever? 

Linwood


-----Original Message-----
From: o...@leferguson.com
Sent: Wednesday, January 15, 2014 12:14 PM
To: 'Marcel Schulte'
Cc: omd-users@lists.mathias-kettner.de
Subject: RE: [omd-users] Duplicate messages for ping-only

>>> How do you configure your hosts (Nagios config files, CMK WATO, whatever)?

Sorry, I should have said - I am using the Multisite UI and trying to use WATO 
for everything I can.  I am also struggling a bit to understand exactly which 
files are (or should be) hand editable and which WATO takes care of for me.

Before I wrote this I went through the configs again, and am more confused, but 
do have a clue.

The devices were initially populated into hosts.mk by a script I wrote that did 
an nmap scan.  I tried to copy the content and format of a hosts.mk that had 
been created manually.

But I now see that a host added manually to WATO does NOT give the extra 
notification.  But I can't seem to find out why.

When I added them to hosts.mk by script, I did it in a separate empty folder, 
creating a file (I thought) that looked like it should, with all_hosts, 
ipaddresses and host_attributes sections.   And indeed WATO seemed happy, did 
the inventory (in WATO not manually), found the right services, updated to the 
nagios files - seemed to be fine.
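
For reference, a minimal sketch of how a discovery script might write those
three sections. The layout is modeled on what a WATO-written hosts.mk looked
like around this release (a "wato" tag and FOLDER_PATH in each all_hosts
entry); that layout is an assumption here, so compare it with a file WATO
itself wrote on your version. The function names and the sample hosts are
hypothetical.

```python
def duplicate_names(hosts):
    """Return host names that occur more than once, in first-seen order."""
    seen, dupes = set(), []
    for h in hosts:
        if h["name"] in seen and h["name"] not in dupes:
            dupes.append(h["name"])
        seen.add(h["name"])
    return dupes


def render_hosts_mk(hosts):
    """Render the text of a hosts.mk for a list of host dicts.

    Each host dict needs 'name', 'ip', 'tags' (list of str) and
    'attrs' (dict of extra attributes for host_attributes).
    """
    dupes = duplicate_names(hosts)
    if dupes:
        raise ValueError("duplicate host names: %s" % ", ".join(dupes))

    lines = [
        "# Generated by a discovery script",
        "# encoding: utf-8",
        "",
        "all_hosts += [",
    ]
    for h in hosts:
        # name|tag|tag|...|wato|/FOLDER_PATH/  (assumed WATO layout)
        entry = "|".join([h["name"]] + list(h["tags"]) + ["wato"])
        lines.append('  "%s|/" + FOLDER_PATH + "/",' % entry)
    lines.append("]")
    lines.append("")

    ipaddresses = {h["name"]: h["ip"] for h in hosts}
    lines.append("# Explicit IP addresses")
    lines.append("ipaddresses.update(%r)" % (ipaddresses,))
    lines.append("")

    attributes = {h["name"]: dict(h["attrs"], ipaddress=h["ip"]) for h in hosts}
    lines.append("# Host attributes (needed for WATO)")
    lines.append("host_attributes.update(%r)" % (attributes,))
    return "\n".join(lines) + "\n"


# Hypothetical hosts, using documentation-range addresses.
SAMPLE_HOSTS = [
    {"name": "printer1", "ip": "192.0.2.10",
     "tags": ["lan", "prod", "ping"], "attrs": {}},
    {"name": "camera1", "ip": "192.0.2.11",
     "tags": ["lan", "prod", "ping"], "attrs": {}},
]

if __name__ == "__main__":
    print(render_hosts_mk(SAMPLE_HOSTS))
```

Refusing to write the file when a name appears twice is a deliberate choice:
leftovers from an earlier, imperfect run of the script are exactly the kind
of garbage that is hard to spot later.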

But I just went in and added a host manually, and it gives only a single 
notification.  So there's obviously a difference somewhere.

It's possible in some iteration of this (the script was not perfect the first 
time of course) it left some garbage around. 

I just went carefully through the hosts.mk file (as it now exists after WATO 
took over) and the lines for the added host and the original host (which have 
the same IP, but I used a different name) are identical, except for the name 
(though I noticed I left off the domain name, but it's using the IP address).   

I also did a find of all .cfg files with the host name in them, and compared 
lines in those (it was only in ./etc/nagios/conf.d/check_mk_objects.cfg) and as 
best I can see by eye it's identical there also, same tags, hostgroups, contact 
groups, service definition and contents.  
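
Comparing by eye is easy to get wrong, so here is a rough sketch of doing the
same comparison mechanically. It pulls every "define ... { ... }" block that
names a given host out of the generated Nagios config, replaces the host name
with a placeholder, and diffs the result for two hosts. It assumes simple
blocks with one directive per line and no "}" inside a value; the function
names and the sample config are hypothetical.

```python
import difflib
import re

# Simplification: assumes no "}" appears inside a directive value.
BLOCK_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def normalized_objects(cfg_text, host_name):
    """Return sorted, name-neutral lines for all objects of one host."""
    blocks = []
    for kind, body in BLOCK_RE.findall(cfg_text):
        directives = []
        for raw in body.splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)  # handles tabs and spaces
            value = parts[1].strip() if len(parts) > 1 else ""
            directives.append((parts[0], value))
        if ("host_name", host_name) not in directives:
            continue
        name_re = r"\b%s\b" % re.escape(host_name)
        blocks.append([
            "%s: %s = %s" % (kind, key, re.sub(name_re, "<HOST>", value))
            for key, value in sorted(directives)
        ])
    # Sort blocks so the order in the file does not matter.
    return [line for block in sorted(blocks) for line in block]


def diff_hosts(cfg_text, host_a, host_b):
    """Unified diff of the normalized objects of two hosts."""
    return list(difflib.unified_diff(
        normalized_objects(cfg_text, host_a),
        normalized_objects(cfg_text, host_b),
        fromfile=host_a, tofile=host_b, lineterm=""))


# Hypothetical config with one deliberate difference.
SAMPLE_CFG = """
define host {
  host_name              printer1
  address                192.0.2.10
  contact_groups         all
}
define service {
  host_name              printer1
  service_description    PING
  notifications_enabled  1
}
define host {
  host_name              printer2
  address                192.0.2.10
  contact_groups         all
}
define service {
  host_name              printer2
  service_description    PING
  notifications_enabled  0
}
"""

if __name__ == "__main__":
    print("\n".join(diff_hosts(SAMPLE_CFG, "printer1", "printer2")))
```

An empty diff means the two hosts are defined identically apart from the
name, and the difference has to come from somewhere other than this file.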

It is not a single host, it is every host that I added as ping only from the 
script, by the way.

My guess is that by adding it by a script I've screwed something up, but I 
don't know where I should look to find it.  Any suggestions where to look?  
Even if I need to wipe everything out and start again, I really want a script 
to do discovery, so I need to figure out how I screwed it up.




-----Original Message-----
From: Marcel Schulte [mailto:schulte.mar...@gmail.com]
Sent: Wednesday, January 15, 2014 2:26 AM
To: o...@leferguson.com
Cc: omd-users@lists.mathias-kettner.de
Subject: Re: [omd-users] Duplicate messages for ping-only

Hi Linwood,

if you have a ping-only host you don't have any services (of course only as 
long as you don't configure any other legacy/active checks) and because of this 
the PING service is displayed.

For CMK/SNMP hosts there are other services so no need for the PING service to 
appear ;-)

Back to your notification problem... Configure the PING service to be excluded 
from notification and you end up with the same behaviour as with the other 
hosts.

How do you configure your hosts (Nagios config files, CMK WATO, whatever)?

Regards,
Marcel

2014/1/15 o...@leferguson.com <o...@leferguson.com>:
> This is probably going to be a silly question as my first to the list, 
> so apologies in advance if so...
>
>
>
> Caveat: I have no prior experience with Nagios (et al), only 
> commercial systems like What's Up Gold.
>
>
>
> I got the latest (1.10) and have configured it on Ubuntu server 13.10, 
> using nagios as the core.  It's running, monitoring. I've scripted 
> host discovery with nmap and got it nicely discovering the lan, and 
> inventorying services.
> Am starting to experiment with various options and configuration (e.g. 
> a distributed polling site soon and some wan tests).
>
>
>
> One really minor thing is bugging me, and I've wasted most of the day 
> looking for it.
>
>
>
> When I have a ping-only host that goes down, or comes back up, I get 
> two messages.  For example just got both:
>
>
>
> PING RECOVERY, CRITICAL -> OK (RECOVERY), Service PING
>
> RECOVERY, DOWN->UP (RECOVERY, state
>
>
>
> OK, I get that the node has a state as well as the service, but this 
> doesn't happen for SNMP polled devices nor for Check_MK agent devices.
> There I get just the state RECOVERY.
>
>
>
> These hosts have tags "lan|prod|ping", and I have not changed (at least 
> knowingly) any of the out of the box settings for the checks.
>
>
>
> Both messages are correct of course - it can't ping, and it is down.  
> But it seems like, consistent with SNMP/Check_MK agent devices, it 
> should only send the state message.  Is that correct?
>
>
>
> But I can't find any description (I haven't gone as far as browsing 
> the code - yet) of how it determines states relative to these tests 
> and knows not to continue to check services if the object state is 
> down.  I suspect I'm also going to have a problem with the first 
> non-pingable devices that I can poll for SNMP (etc) as well?  Not sure, 
> will cross that later.
>
>
>
> Thanks in advance for any pointers,
>
>
>
> Linwood
>
>
>
> PS. If it matters notifications are going out via email.
>
>
> _______________________________________________
> omd-users mailing list
> omd-users@lists.mathias-kettner.de
> http://lists.mathias-kettner.de/mailman/listinfo/omd-users
>