Re: [Nagios-users] Notifications severely impacting Nagios performance. Ideas?

2009-08-25 Thread Jim Avery
2009/8/25 Ryan Bowlby rbowlb...@yahoo.com:
 Hello Everyone,

 We are experiencing issues with Nagios notifications. Apparently when a large 
 amount of services become critical and trigger notifications the system all 
 but stops processing new service check results (passive). The last check 
 times in the web interface stop updating until notifications are disabled.

 Tailing the nagios.log file during this issue shows mostly notifications and 
 hardly any check results being processed. As soon as I temporarily disable 
 notifications - system wide - the passive check results literally fly off the 
 screen, and all the stale results in the web interface clear up quickly.

 How does everyone handle this issue? Obviously the notifications are a 
 bottleneck. Has anyone configured notifications to be dropped in a file or 
 fifo to be processed by an external daemon? Something like OCP daemon if 
 anyone is familiar with it.

 Obviously parent/child relationships would prove useful in curbing the 
 notifications, but implementation of relationships is a while off (5 D.C.s, 
 lots of data to have to collect). Any suggestions would be greatly 
 appreciated!



If you have set retry_check_interval to a very low value compared with
normal_check_interval, then inevitably Nagios will be working a lot
harder if a lot of hosts are down at the same time.
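
A rough back-of-the-envelope sketch of that effect, in Python. Only the service total comes from the original post; the interval values and the number of failing services are assumptions for illustration. It applies to actively scheduled checks that are being retried, not to passive results.

```python
def checks_per_minute(services, interval_minutes):
    """Average checks per minute if each service is checked once per interval."""
    return services / interval_minutes


total_services = 9321        # figure from the original post
normal_check_interval = 5    # minutes (assumed value)
retry_check_interval = 1     # minutes (assumed value)
failing = 3000               # services currently being retried (assumed value)

healthy_rate = checks_per_minute(total_services, normal_check_interval)
outage_rate = (
    checks_per_minute(total_services - failing, normal_check_interval)
    + checks_per_minute(failing, retry_check_interval)
)

print(f"all OK:        {healthy_rate:.0f} checks/min")
print(f"during outage: {outage_rate:.0f} checks/min")
```

The closer retry_check_interval is to normal_check_interval, the smaller the jump during an outage.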



hth (a bit),

Jim

--
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with 
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Notifications severely impacting Nagios performance. Ideas?

2009-08-25 Thread Marc Powell

On Aug 24, 2009, at 9:43 PM, Ryan Bowlby wrote:

 Hello Everyone,

 We are experiencing issues with Nagios notifications. Apparently  
 when a large amount of services become critical and trigger  
 notifications the system all but stops processing new service check  
 results (passive). The last check times in the web interface stop  
 updating until notifications are disabled.

 Tailing the nagios.log file during this issue shows mostly  
 notifications and hardly any check results being processed. As soon  
 as I temporarily disable notifications - system wide - the passive  
 check results literally fly off the screen, and all the stale  
 results in the web interface clear up quickly.

I can confirm this behavior, at least under 2.x. I haven't had a major  
outage under 3.x yet to see if it still happens. My previous  
speculation on the issue is at the link below. I don't have enough  
knowledge of the code to be certain that's what's really happening  
though. Since then, I've come to suspect that it might actually be the  
logic to determine who gets notified rather than the actual  
notification itself.

http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg10803.html

 How does everyone handle this issue?

I tried backgrounding the notification script but that didn't help. We
recognize that in the event of a major outage, we don't want to
receive those notifications anyway, so we disable notifications
program-wide until the outage is over. Not a fix, per se, but a workable
process for us. It just doesn't happen often enough to really
troubleshoot or spend a lot of time on.
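
For reference, one way to toggle notifications program-wide without the web interface is the external command file. The sketch below is a minimal Python example, not what Marc necessarily uses: DISABLE_NOTIFICATIONS and ENABLE_NOTIFICATIONS are standard Nagios external commands, but the helper names are made up here and the command file path is site-specific (the command_file setting in nagios.cfg).

```python
import time


def external_command(name, now=None):
    """Format one Nagios external command line: "[epoch] COMMAND"."""
    ts = int(time.time() if now is None else now)
    return f"[{ts}] {name}\n"


def send_command(name, command_file):
    """Write the command to the Nagios command file (a named pipe).

    The path is an assumption; check command_file in your nagios.cfg.
    """
    with open(command_file, "w") as fh:
        fh.write(external_command(name))


# Hypothetical usage during a major outage:
# send_command("DISABLE_NOTIFICATIONS", "/usr/local/nagios/var/rw/nagios.cmd")
# ...and once things have settled:
# send_command("ENABLE_NOTIFICATIONS", "/usr/local/nagios/var/rw/nagios.cmd")
```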

--
Marc



Re: [Nagios-users] Notifications severely impacting Nagios performance. Ideas?

2009-08-25 Thread Arlen Drina
Hello Ryan,

I do not know how your network is organized, but you could take a
look at service/host dependencies. Depending on your network layout,
I think they could help:

http://nagios.sourceforge.net/docs/2_0/dependencies.html

Their purpose is to suppress notifications when the main host is down,
so that you get a notification only about that problem.

I believe you have some gateway/router, and when it has problems you
get many notifications. I think it is rare for many hosts to have a
local problem at the same time.
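
With thousands of hosts, the definitions would probably have to be generated rather than written by hand. A minimal Python sketch, assuming you already have a mapping from each gateway to the hosts behind it (the mapping, host names, and function name are hypothetical; hostdependency, host_name, dependent_host_name, and notification_failure_criteria are standard Nagios object directives):

```python
def host_dependencies(gateways):
    """Render one hostdependency definition per gateway.

    gateways maps a master host name to the list of hosts behind it.
    Notifications for the dependents are suppressed while the master
    is DOWN (d) or UNREACHABLE (u).
    """
    blocks = []
    for master, dependents in sorted(gateways.items()):
        blocks.append(
            "define hostdependency{\n"
            f"    host_name                      {master}\n"
            f"    dependent_host_name            {','.join(dependents)}\n"
            "    notification_failure_criteria  d,u\n"
            "    }\n"
        )
    return "\n".join(blocks)


# Hypothetical example data: one gateway per data center.
print(host_dependencies({"dc1-gw": ["dc1-web01", "dc1-web02"]}))
```

The output can be written to a .cfg file and checked with the usual configuration verification before reloading.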

Hope this helps

Nice regards,

Arlen

On Tue, Aug 25, 2009 at 4:43 AM, Ryan Bowlby rbowlb...@yahoo.com wrote:
 Hello Everyone,

 We are experiencing issues with Nagios notifications. Apparently when a large 
 amount of services become critical and trigger notifications the system all 
 but stops processing new service check results (passive). The last check 
 times in the web interface stop updating until notifications are disabled.

 Tailing the nagios.log file during this issue shows mostly notifications and 
 hardly any check results being processed. As soon as I temporarily disable 
 notifications - system wide - the passive check results literally fly off the 
 screen, and all the stale results in the web interface clear up quickly.

 How does everyone handle this issue? Obviously the notifications are a 
 bottleneck. Has anyone configured notifications to be dropped in a file or 
 fifo to be processed by an external daemon? Something like OCP daemon if 
 anyone is familiar with it.

 Obviously parent/child relationships would prove useful in curbing the 
 notifications, but implementation of relationships is a while off (5 D.C.s, 
 lots of data to have to collect). Any suggestions would be greatly 
 appreciated!

 -Ryan B.

 ==

 FreeBSD 7.0-RELEASE amd64, Nagios 3.0 w/ NDO2DB 1.4b7
 Total Hosts: 3741, Total Services: 9321







Re: [Nagios-users] Notifications severely impacting Nagios performance. Ideas?

2009-08-25 Thread Sean McAfee
Ryan Bowlby wrote:
 How does everyone handle this issue? Obviously the notifications are a 
 bottleneck. Has anyone configured notifications to be dropped in a file or 
 fifo to be processed by an external daemon? Something like OCP daemon if 
 anyone is familiar with it.

I've experienced the exact same behavior on both 2.x and 3.x when remote 
sites go down.  Since each passive result takes 10s to time out, the 
queue just becomes so backed up it stops being useful.

OCP Daemon fixed it for me.
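
The general pattern, as I understand it, is that the command Nagios runs only appends a line to a pipe or file and returns immediately, while a separate process does the slow work. Below is a minimal Python sketch of the same idea applied to notifications, which is what Ryan asked about. It is not OCP Daemon's actual code: the function names, the tab-separated spool format, and the use of a plain spool file instead of a fifo are all assumptions for illustration.

```python
import os


def enqueue(spool_path, contact, message):
    """Run from the Nagios notification command: append one line and return."""
    one_line = message.replace("\n", " ")
    with open(spool_path, "a") as fh:
        fh.write(f"{contact}\t{one_line}\n")


def drain(spool_path, deliver):
    """Run by the external daemon: claim the spool, then deliver each entry.

    deliver is any callable taking (contact, message), e.g. a mail sender.
    Returns the number of entries delivered.
    """
    batch = spool_path + ".work"
    try:
        # Rename is atomic on POSIX, so new entries land in a fresh file.
        os.rename(spool_path, batch)
    except FileNotFoundError:
        return 0  # nothing queued
    sent = 0
    with open(batch) as fh:
        for line in fh:
            contact, message = line.rstrip("\n").split("\t", 1)
            deliver(contact, message)
            sent += 1
    os.remove(batch)
    return sent
```

A real daemon would also need to handle a writer that still holds the old file open at the moment of the rename, and delivery failures partway through a batch.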

-- 
Sean McAfee
System Engineer



[Nagios-users] Notifications severely impacting Nagios performance. Ideas?

2009-08-24 Thread Ryan Bowlby
Hello Everyone,

We are experiencing issues with Nagios notifications. Apparently when a large 
amount of services become critical and trigger notifications the system all but 
stops processing new service check results (passive). The last check times in 
the web interface stop updating until notifications are disabled.

Tailing the nagios.log file during this issue shows mostly notifications and 
hardly any check results being processed. As soon as I temporarily disable 
notifications - system wide - the passive check results literally fly off the 
screen, and all the stale results in the web interface clear up quickly.

How does everyone handle this issue? Obviously the notifications are a 
bottleneck. Has anyone configured notifications to be dropped in a file or fifo 
to be processed by an external daemon? Something like OCP daemon if anyone is 
familiar with it.

Obviously parent/child relationships would prove useful in curbing the 
notifications, but implementation of relationships is a while off (5 D.C.s, 
lots of data to have to collect). Any suggestions would be greatly appreciated!

-Ryan B.

==

FreeBSD 7.0-RELEASE amd64, Nagios 3.0 w/ NDO2DB 1.4b7
Total Hosts: 3741, Total Services: 9321



  
