Well, deleting and adding the host stopped it for a while, but now overnight the duplicate (ping + host) notifications are back.
Is there a proper way to debug what is generating notifications, some way to see into the logic it used?

-----Original Message-----
From: o...@leferguson.com
Sent: Wednesday, January 15, 2014 7:12 PM
To: 'omd-users@lists.mathias-kettner.de'
Subject: RE: [omd-users] Duplicate messages for ping-only

Well, I found out something -- if I delete a host, and use my script to re-discover it and load it in hosts.mk, it stops giving duplicate notifications.

So something went wrong, I assume in my first attempts at this, but I still can't figure out how to tell what happened (I do have some more hosts giving duplicate messages if anyone has ideas). But I can (I think) get rid of the duplicates by starting over and recreating them.

I was a bit disappointed when I deleted and re-added a host to find it still had its history. That would appear to mean the history is keyed by name, and not by some internal ID, and that it is retained when you delete with WATO. Is there some garbage collection or cleanup that occurs after the fact that I just didn't wait for? Or do deleted hosts keep data around forever?

Linwood

-----Original Message-----
From: o...@leferguson.com
Sent: Wednesday, January 15, 2014 12:14 PM
To: 'Marcel Schulte'
Cc: omd-users@lists.mathias-kettner.de
Subject: RE: [omd-users] Duplicate messages for ping-only

>>> How do you configure your hosts (Nagios config files, CMK WATO, whatever)?

Sorry, I should have said - I am using the Multisite UI and trying to use WATO for everything I can (also struggling a bit to understand exactly which files are hand editable (or should be) and which WATO takes care of for me).

Before I wrote this I went through the configs again, and am more confused, but do have a clue. The devices were initially populated into hosts.mk by a script I wrote that did an nmap scan. I tried to copy a hosts.mk that had been done manually for content and format. But I now see that a host added manually to WATO does NOT give the extra notification.
But I can't seem to find out why.

When I added them to hosts.mk by script, I did it in a separate empty folder, creating a file (I thought) that looked like it should, with all_hosts, ipaddresses and host_attributes sections. And indeed WATO seemed happy, did the inventory (in WATO, not manually), found the right services, updated the nagios files - seemed to be fine.

But I just went in and added a host manually, and it gives only a single notification. So there's obviously a difference somewhere. It's possible that in some iteration of this (the script was not perfect the first time, of course) it left some garbage around.

I just went carefully through the hosts.mk file (as it now exists after WATO took over), and the lines for the added host and the original host (which have the same IP, but I used a different name) are identical except for the name (though I noticed I left off the domain name, but it's using the IP address).

I also did a find of all .cfg files with the host name in them, and compared the lines in those (it was only in ./etc/nagios/conf.d/check_mk_objects.cfg), and as best I can see by eye it's identical there also: same tags, hostgroups, contact groups, service definition and contents.

It is not a single host, by the way; it is every host that I added as ping-only from the script.

My guess is that by adding it by a script I've screwed something up, but I don't know where I should look to find it. Any suggestions where to look? Even if I need to wipe everything out and start again, I really want a script to do discovery, so I need to figure out how I screwed it up.
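
For the by-eye comparison of the two host entries in check_mk_objects.cfg described above, a small script may be less error-prone. What follows is only a sketch in Python, not anything shipped with OMD or Check_MK. It assumes the standard Nagios `define host { ... }` object syntax; the function names `parse_host_definitions` and `diff_hosts` and the sample host names are made up for illustration. It treats `;` as the start of a comment and does not cope with a `}` inside a value.

```python
import re


def parse_host_definitions(cfg_text):
    """Return {host_name: {directive: value}} for every 'define host' block."""
    hosts = {}
    for block in re.findall(r"define\s+host\s*\{(.*?)\}", cfg_text, flags=re.DOTALL):
        attrs = {}
        for line in block.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)  # directive name, then the rest as its value
            attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        if "host_name" in attrs:
            hosts[attrs["host_name"]] = attrs
    return hosts


def diff_hosts(hosts, name_a, name_b, ignore=("host_name", "alias")):
    """Return {directive: (value_a, value_b)} for directives that differ."""
    a, b = hosts[name_a], hosts[name_b]
    diffs = {}
    for key in sorted(set(a) | set(b)):
        if key in ignore:
            continue
        if a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs


# Hypothetical sample data; in practice read the text from
# etc/nagios/conf.d/check_mk_objects.cfg inside the site.
sample_cfg = """
define host {
  host_name             scripted1
  address               10.0.0.5
  notification_options  d,u,r
}
define host {
  host_name             manual1
  address               10.0.0.5
  notification_options  d,r
}
"""

if __name__ == "__main__":
    parsed = parse_host_definitions(sample_cfg)
    for directive, (left, right) in diff_hosts(parsed, "scripted1", "manual1").items():
        print(directive, left, right)
```

An empty result would suggest the difference is not in the host definitions themselves, so the service definitions or the notification rules would be the next place to look.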
-----Original Message-----
From: Marcel Schulte [mailto:schulte.mar...@gmail.com]
Sent: Wednesday, January 15, 2014 2:26 AM
To: o...@leferguson.com
Cc: omd-users@lists.mathias-kettner.de
Subject: Re: [omd-users] Duplicate messages for ping-only

Hi Linwood,

if you have a ping-only host you don't have any services (of course only as long as you don't configure any other legacy/active checks) and because of this the PING service is displayed. For CMK/SNMP hosts there are other services so no need for the PING service to appear ;-)

Back to your notification problem... Configure the PING service to be excluded from notification and you end up with the same behaviour as with the other hosts.

How do you configure your hosts (Nagios config files, CMK WATO, whatever)?

Regards,
Marcel

2014/1/15 o...@leferguson.com <o...@leferguson.com>:
> This is probably going to be a silly question as my first to the list,
> so apologies in advance if so...
>
> Caveat: I have no prior experience with Nagios (et al), only
> commercial systems like What's Up Gold.
>
> I got the latest (1.10) and have configured it on Ubuntu server 13.10,
> using nagios as the core. It's running, monitoring. I've scripted
> host discovery with nmap and got it nicely discovering the lan, and
> inventorying services.
> Am starting to experiment with various options and configuration (e.g.
> a distributed polling site soon and some wan tests).
>
> One really minor thing is bugging me, and I've wasted most of the day
> looking for it.
>
> When I have a ping-only host that goes down, or comes back up, I get
> two messages. For example just got both:
>
> PING RECOVERY, CRITICAL -> OK (RECOVERY), Service PING
>
> RECOVERY, DOWN->UP (RECOVERY, state
>
> OK, I get that the node has a state as well as the service, but this
> doesn't happen for SNMP polled devices nor for Check_MK agent devices.
> There I get just the state RECOVERY.
>
> These hosts have tags "lan|prod|ping", and I have not changed (at
> least knowingly) any of the out of the box settings for the checks.
>
> Both messages are correct of course - it can't ping, and it is down.
> But it seems like, consistent with SNMP/Check_MK agent devices, it
> should only send the state message. Is that correct?
>
> But I can't find any description (I haven't gone as far as browsing
> the code - yet) of how it determines states relative to these tests
> and knows not to continue to check services if the object state is
> down. I suspect I'm also going to have a problem with the first
> non-pingable devices that I can poll for SNMP (etc) as well? Not sure,
> will cross that later.
>
> Thanks in advance for any pointers,
>
> Linwood
>
> PS. If it matters notifications are going out via email.
>
> _______________________________________________
> omd-users mailing list
> omd-users@lists.mathias-kettner.de
> http://lists.mathias-kettner.de/mailman/listinfo/omd-users
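
On the question at the top of the thread about seeing what generated each notification: one place to start is the Nagios core log, which records every notification it sends as a `HOST NOTIFICATION:` or `SERVICE NOTIFICATION:` line with semicolon-separated fields. The Python sketch below pairs up host and service notifications sent to the same contact for the same host within a short window, which is the duplicate pattern described in this thread. It is an illustration only: the function names, the 60-second window, and the sample contact, host and command values are assumptions, and the log location (typically var/nagios/nagios.log inside an OMD site) should be checked on the actual system.

```python
import re

# "[epoch] HOST NOTIFICATION: contact;host;state;command;output"
# "[epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output"
NOTIFICATION_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) NOTIFICATION: (.*)$")


def parse_notifications(log_lines):
    """Extract notification events from Nagios log lines."""
    events = []
    for line in log_lines:
        match = NOTIFICATION_RE.match(line.strip())
        if not match:
            continue
        when, kind, payload = int(match.group(1)), match.group(2), match.group(3)
        fields = payload.split(";")
        if kind == "HOST" and len(fields) >= 4:
            contact, host, state, command = fields[:4]
            service = None
        elif kind == "SERVICE" and len(fields) >= 5:
            contact, host, service, state, command = fields[:5]
        else:
            continue  # malformed line, skip it
        events.append({"time": when, "contact": contact, "host": host,
                       "service": service, "state": state, "command": command})
    return events


def find_paired_notifications(events, window_seconds=60):
    """Return (contact, host, service) where a host and a service
    notification went to the same contact within window_seconds."""
    host_events = [e for e in events if e["service"] is None]
    service_events = [e for e in events if e["service"] is not None]
    pairs = []
    for h in host_events:
        for s in service_events:
            same_target = (s["contact"], s["host"]) == (h["contact"], h["host"])
            if same_target and abs(s["time"] - h["time"]) <= window_seconds:
                pairs.append((h["contact"], h["host"], s["service"]))
    return pairs


# Hypothetical sample lines, not taken from a real log.
sample_log = [
    "[1389830000] SERVICE NOTIFICATION: linwood;printer1;PING;OK;check-mk-notify;PING OK",
    "[1389830005] HOST NOTIFICATION: linwood;printer1;UP;check-mk-notify;PING OK",
    "[1389830100] HOST NOTIFICATION: linwood;switch1;UP;check-mk-notify;OK",
]

if __name__ == "__main__":
    for pair in find_paired_notifications(parse_notifications(sample_log)):
        print(pair)
```

Comparing the hosts that show up in the pairs against the hosts that do not may help narrow down which configuration difference matters.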