Hi,

I think I have found the issue. A forgotten "notifies = no" (how that happened
is a mystery) in a poller configuration caused the poller to notify before the
master, which of course fails because the notification scripts are not present
on the poller. Merlin therefore behaved properly in keeping the master from
renotifying. I corrected it, and so far it seems to renotify properly. Since
only 1 poller was affected, that would also explain why the issue came and
went (depending on which poller executed the check).

Frederik Vervaet
UNIX Monitoring (HES) | ITS/IIS
http://mon.bc

From: Naemon-users 
[mailto:naemon-users-bounces+frederik.vervaet=proximus....@monitoring-lists.org]
 On Behalf Of VERVAET Frederik (ITS/IIS)
Sent: Thursday 17 September 2015 23:23
To: [email protected]
Subject: [naemon-users] Weird issue with notification_interval (possible bug ?)

Hi,

I noticed the following oddity the other day:

When I set the notification interval to 5 minutes, the Ninja GUI (I use
naemon+merlin) lists the notification as sent, but NO actual notification was
sent out.

When I enable debug logs in naemon I see:

[1442523762.409221] [032.0] [pid=28500] ** Service Notification Attempt ** 
Host: 'machineX', Service: 'Service X', Type: NORMAL, Options: 0, Current 
State: 2, Last Notification: Thu Sep 17 23:02:42 2015
[1442523762.409273] [032.0] [pid=28500] SERVICE NOTIFICATION SUPPRESSED: 
machineX;Service X;Re-notification blocked for this problem because not enough 
time has passed since last notification.[1442523762.409287] [032.1] [pid=28500] 
Next valid notification time: Thu Sep 17 23:07:42 2015
1442523762 = Thu, 17 Sep 2015 21:02:42 GMT = Thu, 17 Sep 2015 23:02:42 (Local 
time)
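
For anyone who wants to redo the timestamp conversion, here is a minimal
sketch. It assumes the local zone is UTC+2 (CEST), which matches the offset
shown above; the describe() helper is just an illustrative name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is UTC+2 (CEST), matching the offset in the mail.
LOCAL = timezone(timedelta(hours=2))


def describe(epoch: int) -> str:
    """Render an epoch timestamp as 'GMT = local time'."""
    fmt = "%a, %d %b %Y %H:%M:%S"
    gmt = datetime.fromtimestamp(epoch, tz=timezone.utc)
    local = gmt.astimezone(LOCAL)
    return f"{gmt.strftime(fmt)} GMT = {local.strftime(fmt)} (local time)"


print(describe(1442523762))
```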

(Note that the 3rd log entry doesn't start on a new line; this is how it
appears in the logs.)

As you can see, the last notification time in the first entry equals the time
of the current notification attempt.
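
The comparison can be checked mechanically. The sketch below is only an
illustration: it takes the first debug entry (joined back onto one line),
assumes the "Last Notification" field is printed in local time at UTC+2, and
uses a made-up helper name, attempt_and_last().

```python
import re
from datetime import datetime, timedelta, timezone

LOCAL = timezone(timedelta(hours=2))  # assumption: CEST, as in the log

# The first debug entry, joined back onto a single line.
ENTRY = (
    "[1442523762.409221] [032.0] [pid=28500] ** Service Notification Attempt ** "
    "Host: 'machineX', Service: 'Service X', Type: NORMAL, Options: 0, Current "
    "State: 2, Last Notification: Thu Sep 17 23:02:42 2015"
)


def attempt_and_last(entry: str) -> tuple:
    """Return (time of the attempt, 'Last Notification' time) for one entry."""
    epoch = float(re.match(r"\[(\d+\.\d+)\]", entry).group(1))
    # Drop the fractional part: the log prints whole seconds only.
    attempt = datetime.fromtimestamp(int(epoch), tz=LOCAL)
    last_text = re.search(r"Last Notification: (.+)$", entry).group(1)
    last = datetime.strptime(last_text, "%a %b %d %H:%M:%S %Y")
    return attempt, last.replace(tzinfo=LOCAL)


attempt, last = attempt_and_last(ENTRY)
print(attempt == last)  # True: both are 23:02:42 local time
```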

It's as if Naemon starts notifying, updates a number of internal
counters/variables (including last_notification_time), and only afterwards
checks the notification_interval value and reconsiders.

Five minutes later the log shows exactly the same, except that the last
notification time is again set to the time of the current notification
attempt, and so on, basically rendering renotifications useless.

1442524062 = Thu, 17 Sep 2015 21:07:42 GMT = Thu, 17 Sep 2015 23:07:42 (local 
time)
[1442524062.422325] [032.0] [pid=28500] ** Service Notification Attempt ** 
Host: 'hostX', Service: 'serviceX', Type: NORMAL, Options: 0, Current State: 2, 
Last Notification: Thu Sep 17 23:07:42 2015
[1442524062.422370] [032.0] [pid=28500] SERVICE NOTIFICATION SUPPRESSED: 
hostX;serviceX;Re-notification blocked for this problem because not enough time 
has passed since last notification.[1442524062.422382] [032.1] [pid=28500] Next 
valid notification time: Thu Sep 17 23:12:42 2015
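
To make the effect concrete, here is a toy model of the interval check. It is
not Naemon's code: Service, may_renotify() and simulate() are hypothetical
names, and the "external reset" stands in for whatever sets the last
notification time to "now" just before the check runs.

```python
from dataclasses import dataclass


@dataclass
class Service:
    notification_interval: int  # seconds between re-notifications
    last_notification: int = 0  # epoch of the last notification


def may_renotify(svc: Service, now: int) -> bool:
    """True if the interval has elapsed since the last notification."""
    return now >= svc.last_notification + svc.notification_interval


def simulate(external_reset: bool, checks: list) -> list:
    """Run the interval check at each epoch in `checks`.

    With external_reset=True, last_notification is set to `now` right
    before each check, which is what the logs appear to show.
    """
    svc = Service(notification_interval=300,
                  last_notification=checks[0] - 300)
    results = []
    for now in checks:
        if external_reset:
            svc.last_notification = now
        allowed = may_renotify(svc, now)
        if allowed:
            svc.last_notification = now
        results.append(allowed)
    return results


checks = [1442523762, 1442524062]  # the two attempts from the logs
print(simulate(external_reset=False, checks=checks))  # [True, True]
print(simulate(external_reset=True, checks=checks))   # [False, False]
```

With the reset in place the check can never pass, because the next valid time
is always 5 minutes after the attempt itself.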

As usual, as far as I can tell the config is OK.
Naemon Core 2015.c.1 (obtained from git5 repo)
Any pointers would be appreciated.
Frederik Vervaet