[Nagios-users] Availability data from avail.cgi often incorrect due to 'SOFT' OK messages

Brandon Whaley Fri, 20 Jan 2012 11:51:29 -0800

I've been chasing the cause of this issue for a few months and only
today had time to dig in and test it properly.  When we pull data
from avail.cgi for reporting, a host that was recently rebooted will
often show some services as CRITICAL from the reboot until the
midnight log rotation for that day.  After some manual editing of the
archived history logs that avail.cgi reads, I found that this
happens when a post-reboot service check logs a 'SOFT' OK; it doesn't
happen for 'HARD' OK messages.


Right now I've implemented a cron job that runs post-archive and simply
converts all SOFT OK entries to HARD OK entries, which has made my data
correct to the best of my knowledge.
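
For anyone wanting to try the same workaround, here is a minimal sketch
of the rewrite step in Python.  It is not the actual cron script; it
assumes the archived log uses the usual service alert layout of
"[timestamp] SERVICE ALERT: host;service;state;state_type;attempt;output",
and the function names and the host/service in the comment are made up
for illustration.

```python
import re

# Matches an archived service alert whose state is OK and whose state
# type is SOFT, for example (hypothetical host and service):
#   [1327089089] SERVICE ALERT: web01;HTTP;OK;SOFT;2;HTTP OK - 200
# The pattern is anchored on the field positions so that plugin output
# which merely contains "OK;SOFT" is left alone.
SOFT_OK = re.compile(r"^(\[\d+\] SERVICE ALERT: [^;]*;[^;]*;OK;)SOFT(;.*)$")


def harden_soft_ok(line: str) -> str:
    """Return the line with a SOFT OK service alert rewritten as HARD OK."""
    return SOFT_OK.sub(r"\1HARD\2", line)


def harden_log(text: str) -> str:
    """Apply harden_soft_ok to every line, preserving line endings."""
    return "".join(harden_soft_ok(l) for l in text.splitlines(keepends=True))
```

Running harden_log over the contents of each archived file (on a copy
first) leaves every non-matching line, including SOFT CRITICAL entries,
untouched.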

I just have a couple of questions about this:
1) Is this skewing my data in any way?  The timestamps look right
based on the observed length of service outage after doing this.
2) Is this a bug in avail.cgi or in the actual logging of this data?
Should there have been a second "HARD" OK status entry in the log or
was the first entry supposed to be "HARD"?

Also, I am on an older version of Nagios (3.2.3) than the current
release.  If I've wasted my time working around something fixed in a
more recent version, please let me know.  I'd prefer not to upgrade
unless necessary, as the version we're on has been very stable except
for this one bug.

