Re: [Nagios-users] Nagios 3.0.6 process hangs, then recovers

2009-07-28 Thread Andrew Noonan
It's not in a VM, and I haven't been able to catch it while it is
actually happening yet.  BTW, the system is running CentOS 4.7.

On Tue, Jul 28, 2009 at 8:52 AM, Brian A. Seklecki
sekle...@noc.cfi.pgh.pa.us wrote:
 On Mon, 2009-07-27 at 17:33 -0500, Andrew Noonan wrote:
 have something to do with things... perhaps a hang in that module

 Did you ktrace(8)/strace(8) it out, yet?  You're not running in a VM are
 you?  ~BAS
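
One way to catch the hang while it is in progress is a small watchdog
that attaches strace only once Nagios appears stuck.  The sketch below
is an illustration, not a tested tool.  It assumes that a hung Nagios
stops rewriting its status file, that the status file and lock file
live at the paths shown (typical for a source install, but check
status_file and lock_file in nagios.cfg), and that strace is installed
and the script runs with enough privilege to attach.  The names
is_stale, capture_trace and watch are made up for this example.

```python
import os
import subprocess
import time

# Assumed paths and thresholds; adjust to the local nagios.cfg.
STATUS_FILE = "/usr/local/nagios/var/status.dat"
LOCK_FILE = "/usr/local/nagios/var/nagios.lock"  # contains the daemon PID
MAX_AGE = 300        # seconds without a status update before calling it hung
TRACE_SECONDS = 60   # how long to let strace record


def is_stale(path, max_age, now=None):
    """Return True if path was last modified more than max_age seconds ago."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > max_age


def capture_trace(pid, out_path, seconds):
    """Attach strace to pid for a limited time, writing the trace to out_path."""
    proc = subprocess.Popen(
        ["strace", "-f", "-tt", "-p", str(pid), "-o", out_path]
    )
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM lets strace detach from the traced process
        proc.wait()


def watch(poll_seconds=60):
    """Poll the status file and record a trace whenever it goes stale."""
    while True:
        if is_stale(STATUS_FILE, MAX_AGE):
            with open(LOCK_FILE) as f:
                pid = int(f.read().strip())
            out_path = "/tmp/nagios-hang-%d.strace" % int(time.time())
            capture_trace(pid, out_path, TRACE_SECONDS)
        time.sleep(poll_seconds)
```

Calling watch() from a screen session or a simple init script would
leave a trace file in /tmp for each period in which the status file
went stale, which can then be posted to the list.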




--
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with 
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


[Nagios-users] Nagios 3.0.6 process hangs, then recovers

2009-07-27 Thread Andrew Noonan
Hi all,

This is the second time this has happened to me... Nagios is working
fine when suddenly it stops monitoring.  Hours later, the process
un-hangs and a message like:

[1248653042] Warning: A system time change of 0d 3h 39m 9s (forwards
in time) has been detected.  Compensating...

appears, followed by Nagios complaining about orphaned checks and a
rescheduling.  The last time I posted to the list, someone
suggested that this was a time zone problem... that an actual system
time change had occurred, but let me assure you, no such thing
happened.  Other process log files continued to log uninterrupted
during this time, and there is no individual TZ setting for the nagios
user.  No NTP messages, etc.  Plus, other than NTP problems, a TZ
change would likely be a multiple of an hour or half hour.  That being
said, has anyone ever had these problems?  I've had two of these in a
month.  The system was not loaded during this period, with plenty of
memory and CPU to spare.  I'm also running ndo2db, which I worry may
have something to do with things... perhaps a hang in that module
causes the process to spin?  The system has about 1000 checks per 5
minutes, so not overwhelmingly busy.
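
To line the stall up against other logs on the box, the warning quoted
above can be turned into a wall-clock window.  This is a rough sketch:
it relies on the bracketed number being a Unix timestamp, which is how
Nagios stamps its log, and it assumes the reported "time change" is
roughly how long the process was stuck.  The function name stall_window
is made up for this example.

```python
import re
from datetime import datetime, timedelta, timezone

# The warning from the Nagios log, joined onto one line.
LOG_LINE = (
    "[1248653042] Warning: A system time change of 0d 3h 39m 9s "
    "(forwards in time) has been detected.  Compensating..."
)

PATTERN = re.compile(
    r"\[(\d+)\] Warning: A system time change of "
    r"(\d+)d (\d+)h (\d+)m (\d+)s"
)


def stall_window(line):
    """Return (start, end) in UTC for the gap reported by the log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a time-change warning")
    epoch, days, hours, minutes, seconds = (int(x) for x in match.groups())
    end = datetime.fromtimestamp(epoch, tz=timezone.utc)
    gap = timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    # Assumption: the gap ends at the moment the warning was logged.
    return end - gap, end


start, end = stall_window(LOG_LINE)
print("stalled from", start.isoformat(), "to", end.isoformat())
```

For the line above this prints a window from 2009-07-26T20:24:53+00:00
to 2009-07-27T00:04:02+00:00 (UTC), which is the period to look at in
syslog, cron and ndo2db logs.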

This is the last major hurdle before I begin using Nagios in
production, but this is a pretty big problem.  Any advice would help.

Thanks,
Andrew
