Hi,

We have the following setup:

Nagios 2.5
73 hosts
335 active checks
765 passive checks

Each host submits its passive check results every 5 minutes to the
Nagios command file. The freshness threshold is set to 2000 seconds,
so the service stays passive as long as everything runs as it should.
The check_command (check_disk) is defined to call check_dummy, which
generates the message "No data from host" when executed actively.
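
For reference, each passive result is a single PROCESS_SERVICE_CHECK_RESULT
line written to the command file. The following is only a minimal Python
sketch of that format; the function names and the command file path passed
in by the caller are illustrative assumptions, not our actual submission
script.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT line for the Nagios command file."""
    # Nagios expects a Unix timestamp in square brackets before the command.
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit_passive_result(command_file, line):
    """Write the line to the command file (a named pipe on the Nagios server)."""
    # command_file is whatever path the local installation uses (assumption).
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

A call such as format_passive_result("rana", "Local disk", 0, "DISK OK")
yields the kind of line that shows up as EXTERNAL COMMAND in nagios.log.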

This works fine for nearly every host. But there is one host, which
is no different from the others, that causes trouble. The data is
coming in every 5 minutes, but Nagios keeps flipping between active
and passive mode:

[EMAIL PROTECTED] tail -10000 nagios.log | grep rana | grep disk | naglog.pl | cut -c-100
[01.09. 12:29:04]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 12:31:49]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 12:34:24]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 12:34:29]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
[01.09. 12:39:44]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 12:41:50]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 12:45:04]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 12:45:10]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
[01.09. 12:50:29]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 12:51:49]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 12:55:49]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 12:55:59]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
[01.09. 13:01:09]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:01:49]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 13:06:29]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:06:39]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
[01.09. 13:11:50]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 13:11:50]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:11:50]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
[01.09. 13:17:09]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:21:49]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 13:22:30]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:22:39]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
[01.09. 13:27:54]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:31:49]  SERVICE ALERT: rana;Local disk;UNKNOWN;HARD;1;UNKNOWN: No data from host - nsce n
[01.09. 13:33:14]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;rana;Local disk;0;DISK OK - free s
[01.09. 13:33:19]  SERVICE ALERT: rana;Local disk;OK;HARD;1;DISK OK - free space:
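
To put numbers on the pattern in the rana log, a small Python sketch such as
the following could be used. The timestamps are copied from the excerpt; the
function names are illustrative, and the script only measures the log, it
does not model what Nagios does internally.

```python
from datetime import datetime

# Timestamps copied from the rana log excerpt (all on 01.09.).
PASSIVE_RESULTS = ["12:29:04", "12:34:24", "12:39:44", "12:45:04", "12:50:29",
                   "12:55:49", "13:01:09", "13:06:29", "13:11:50", "13:17:09",
                   "13:22:30", "13:27:54", "13:33:14"]
UNKNOWN_ALERTS = ["12:31:49", "12:41:50", "12:51:49", "13:01:49", "13:11:50",
                  "13:21:49", "13:31:49"]


def to_seconds(hms):
    """Convert HH:MM:SS to seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


def alert_spacing(alerts):
    """Seconds between consecutive UNKNOWN alerts."""
    secs = [to_seconds(a) for a in alerts]
    return [later - earlier for earlier, later in zip(secs, secs[1:])]


def result_age_at_alert(results, alerts):
    """Age in seconds of the newest passive result strictly before each alert."""
    result_secs = [to_seconds(r) for r in results]
    ages = []
    for alert in (to_seconds(a) for a in alerts):
        earlier = [r for r in result_secs if r < alert]
        ages.append(alert - max(earlier))
    return ages


if __name__ == "__main__":
    print(alert_spacing(UNKNOWN_ALERTS))
    print(result_age_at_alert(PASSIVE_RESULTS, UNKNOWN_ALERTS))
```

For this excerpt the UNKNOWN alerts are 599 to 601 seconds apart, and the
newest passive result is at most 321 seconds old when an alert fires, far
below the 2000 second threshold. The 600 second spacing would match
normal_check_interval 10 if interval_length is at its default of 60, but
that may be a coincidence.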

All other hosts work just fine:

[EMAIL PROTECTED] tail -10000 nagios.log | grep etamin | grep disk | naglog.pl | cut -c-100
[01.09. 12:30:49]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 12:36:09]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 12:41:29]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 12:46:50]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 12:52:09]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 12:57:29]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:02:49]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:08:09]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:13:34]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:18:54]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:24:14]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:29:34]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free
[01.09. 13:34:54]  EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;etamin;Local disk;0;DISK OK - free

The configuration is:

define service {
        use                             check_local_disk_templ
        host_name                       rana
}
define service{
        use                             passive_templ
        name                            check_local_disk_templ
        service_description             Local disk
        servicegroups                   local-disk-services
        check_command                   check_disk!-w 15% -c 10% -x /afs -e
        notifications_enabled           1
        register                        0
}
define service{
        name                            passive_templ
        register                        0
        max_check_attempts              1
        normal_check_interval           10
        retry_check_interval            1
        active_checks_enabled           0
        passive_checks_enabled          1
        check_freshness                 1
        freshness_threshold             2000
        check_period                    always
        notification_interval           0
        notification_period             always
        notification_options            w,c,r,f
        notifications_enabled           0
        contact_groups                  Server
        process_perf_data               1
}
define command{
        command_name            check_disk
        command_line            $USER1$/check_dummy 3 "No data from host - nsce not running?"
}
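
As we understand template inheritance, directives set locally override those
pulled in through `use`, so the effective settings for rana's service come
from all three definitions. The Python sketch below is a simplified model of
that merge, not Nagios's own parser: it handles only a single template per
`use`, lists only a few of the directives, and the names TEMPLATES and
resolve are made up for illustration.

```python
# Directives copied from the definitions above (only a few shown per template).
TEMPLATES = {
    "passive_templ": {
        "active_checks_enabled": "0",
        "passive_checks_enabled": "1",
        "check_freshness": "1",
        "freshness_threshold": "2000",
        "normal_check_interval": "10",
        "notifications_enabled": "0",
    },
    "check_local_disk_templ": {
        "use": "passive_templ",
        "service_description": "Local disk",
        "notifications_enabled": "1",
    },
}


def resolve(definition, templates):
    """Merge a definition with its template chain; local directives win."""
    parent_name = definition.get("use")
    inherited = resolve(templates[parent_name], templates) if parent_name else {}
    merged = dict(inherited)
    # Overlay the local directives, dropping the 'use' pointer itself.
    merged.update({k: v for k, v in definition.items() if k != "use"})
    return merged


rana_disk = resolve({"use": "check_local_disk_templ", "host_name": "rana"},
                    TEMPLATES)
```

Under this model the service on rana ends up with active_checks_enabled 0,
check_freshness 1 and freshness_threshold 2000, the same as on every other
host that uses check_local_disk_templ.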

Any ideas what is going wrong here? Why is Nagios flipping the
service to active when data arrived less than 2000 seconds ago?

Thanks,

Christopher

-- 
======================================================
    Dipl.-Ing. Christopher Odenbach
    Zentrum fuer Informations- und Medientechnologien
    Universitaet Paderborn
    Raum N5.110
    [EMAIL PROTECTED]
    Tel.: +49 5251 60 5315
======================================================
