Mark Creamer wrote:
> We have a jobs script that runs nightly on one of our production servers.
for a question like this, always include the basic information up front:
- operating system used
- source or package install
- versions used
- log entries for the errors
> For each task that happens in the script, there is a send_nsca command
> which in turn updates the Icinga server. Icinga has services defined
> for each of these, as a passive. Each one looks something like this:
>
> define service{
> use passive_service_24x7
> service_description TestMessage
> host_name localhost
> passive_checks_enabled 1
> active_checks_enabled 0
> check_freshness 1
> freshness_threshold 90000
> check_command service_is_stale
> }
>
> service_is_stale.sh looks like this:
>
> #!/bin/bash
> /bin/echo "CRITICAL: No Current Results for this Service"
> exit 2
>
> The problem I'm having is this: when the script runs nightly and all
> of the alerts get updated, I'm expecting them to stay that way for the
> 90,000 second timeout defined in the service description. However,
> they do not wait that long, and will all go critical before the next
> night's run. I should also explain why I did 90,000, since that may be
> part of my problem. 90,000 seconds is 25 hours, or one hour later than
> the next time the job should run. My understanding of the send_nsca
> process is that the timer will restart once a message is received. So
> if the message is received by 11:15 PM, the 90,000 seconds will start
> then, and Icinga should not show this service as critical unless no
> new message is received by 12:15 AM the next night.
when the core receives the passive check results through the external
command pipe (nsca writes them there after receiving them from
send_nsca), the freshness timer for that service is reset. nsca itself
is not really what determines that timestamp: in the worst case the
external command pipe is backed up and the core processes the result
later.
either way, given that you chose 90k seconds, what do your logs show
when the freshness_threshold triggers?
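
to make the arithmetic concrete, here is a minimal sketch of how the staleness deadline follows from the time of the last result. stale_deadline and freshness_state are hypothetical helpers, not part of icinga, and the sketch ignores any extra latency the core may add on top of the threshold:

```shell
#!/bin/bash
# Sketch only: the helper names are made up for illustration.

# Print the epoch second at which a result received at $1
# exceeds a freshness threshold of $2 seconds.
stale_deadline() {
    echo $(( $1 + $2 ))
}

# Print "stale" or "fresh" for current time $1, last result time $2
# and threshold $3 (all in seconds).
freshness_state() {
    if [ $(( $1 - $2 )) -gt "$3" ]; then
        echo stale
    else
        echo fresh
    fi
}

threshold=90000   # freshness_threshold from the service definition
echo "threshold in hours: $(( threshold / 3600 ))"   # prints 25
```

comparing the "last check" time shown for the service with the moment it turns critical should tell you whether the full threshold actually elapsed.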
>
> I would really rather use a method where each service checks at an
> exact time (active) rather than wait for a timeout (passive).
if you want exact times, use cron. the core scheduler does not let you
define fixed check times; when a check actually runs depends on the
check_interval, latency, and so on.
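
a rough sketch of what the cron approach could look like: cron starts the job at a fixed time, and the job submits its result with send_nsca. the crontab line, the script path, the monitoring host name and the config path are all placeholders I made up; format_nsca_result and submit_result are hypothetical helpers. the tab-separated layout is the input format send_nsca reads on stdin for service results:

```shell
#!/bin/bash
# Hypothetical crontab entry (time and path are placeholders):
#   15 23 * * * /usr/local/bin/nightly_job.sh

# Build one passive service result as send_nsca expects it on stdin:
# host name, service description, return code, plugin output,
# separated by tabs and ended by a newline.
format_nsca_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Send the result to the monitoring server.
# Host name and config path are assumptions, adjust to your setup.
submit_result() {
    format_nsca_result "$@" | send_nsca -H icinga.example.org -c /etc/send_nsca.cfg
}

# Example call after a task has finished successfully:
#   submit_result localhost TestMessage 0 "OK: task finished"
```

the service stays passive on the icinga side; cron only guarantees when the result is sent.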
> I'd be open to any other suggestions, or if it looks like I'm not
> using send_nsca and passive checks correctly.
timezone problems? differing timestamps (results dated in the past or
future being dropped)? what exact error did you get?
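
one quick way to rule out clock differences is to compare epoch seconds on both hosts, since those do not depend on the timezone setting. a small sketch; clock_skew and clocks_agree are hypothetical helpers and the tolerance is whatever you consider acceptable:

```shell
#!/bin/bash
# Sketch only: helper names and the tolerance are assumptions.

# Print the absolute difference between two epoch timestamps.
clock_skew() {
    local d=$(( $1 - $2 ))
    echo "${d#-}"          # strip a leading minus sign
}

# Succeed if two epoch timestamps differ by at most $3 seconds.
clocks_agree() {
    [ "$(clock_skew "$1" "$2")" -le "$3" ]
}

# Print the local clock as epoch seconds (UTC based);
# run this on the sending host and on the icinga server and compare.
date -u +%s
```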
> Thanks,
>
>
>
> --
> Mark
>
>
> _______________________________________________
> icinga-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/icinga-users
--
DI (FH) Michael Friedrich
Vienna University Computer Center
Universitaetsstrasse 7 A-1010 Vienna, Austria
email: [email protected]
phone: +43 1 4277 14359
mobile: +43 664 60277 14359
fax: +43 1 4277 14338
web: http://www.univie.ac.at/zid
http://www.aco.net
Lead Icinga Core Developer
http://www.icinga.org