Hi,

I manually switched both hosts back to OK, then ran tail and grep on the
log for the two hosts:
[1253235965] API LOG: admin;PROCESS_HOST_CHECK_RESULT;gadgets01;0;test|
[1253235972] HOST ALERT: gadgets01;UP;HARD;1;test
[1253235972] HOST NOTIFICATION: admin/distprofile;gadgets01;UP;notify-by-atom;test
[1253235978] SERVICE ALERT: gadgets01;Linux CPU Usage;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253235978] SERVICE ALERT: gadgets01;Linux Network Usage;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236011] API LOG: admin;PROCESS_HOST_CHECK_RESULT;gadgets02;0;test|
[1253236013] HOST ALERT: gadgets02;UP;HARD;1;test
[1253236013] HOST NOTIFICATION: admin/distprofile;gadgets02;UP;notify-by-atom;test
[1253236027] SERVICE ALERT: gadgets02;Linux Disk Usage;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236037] SERVICE ALERT: gadgets02;Nagios Agent check;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236037] SERVICE ALERT: gadgets01;Linux Load Average;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236037] SERVICE ALERT: gadgets01;Nagios Agent check;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236079] SERVICE ALERT: gadgets02;Linux Hardware Spec;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236081] SERVICE ALERT: gadgets02;Linux TCP Established;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236097] HOST ALERT: gadgets01;UNREACHABLE;SOFT;1;CRITICAL - 10.10.10.19: rta nan, lost 100%
[1253236126] SERVICE ALERT: gadgets02;Linux Load Average;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236126] SERVICE ALERT: gadgets02;Syslogd Check;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 10 seconds.
[1253236158] HOST NOTIFICATION: admin/distprofile;gadgetsdb01;UNREACHABLE;notify-by-atom;CRITICAL - 10.10.10.21: rta nan, lost 100%
[1253236186] HOST ALERT: gadgets02;DOWN;SOFT;1;CRITICAL - 10.10.10.20: rta nan, lost 100%

The result is the same as before: gadgets01 goes UNREACHABLE while
gadgets02 goes DOWN.
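
In case it helps with comparing the two hosts, here is a minimal Python
sketch that pulls only the HOST ALERT entries out of log lines like the
ones above. It assumes each entry sits on a single line in the form
"[epoch] TYPE: field;field;..." (the mail client wrapped some of them).
The names parse_entry, host_alerts and SAMPLE are made up for this example
and are not part of Nagios or Opsview.

```python
import re
from datetime import datetime, timezone

# Assumed entry layout: "[epoch] TYPE: field;field;..."
LOG_LINE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def parse_entry(line):
    """Split one log line into time, entry type and fields; None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    epoch, kind, rest = match.groups()
    return {
        "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "kind": kind,
        "fields": rest.split(";"),
    }


def host_alerts(lines, hosts):
    """Yield (time, host, state, state_type) for HOST ALERT entries of the given hosts."""
    for line in lines:
        entry = parse_entry(line)
        if entry is None or entry["kind"] != "HOST ALERT":
            continue
        if len(entry["fields"]) < 3 or entry["fields"][0] not in hosts:
            continue
        host, state, state_type = entry["fields"][:3]
        yield entry["time"], host, state, state_type


# Three entries copied from the log above, unwrapped onto single lines.
SAMPLE = [
    "[1253235972] HOST ALERT: gadgets01;UP;HARD;1;test",
    "[1253236097] HOST ALERT: gadgets01;UNREACHABLE;SOFT;1;CRITICAL - 10.10.10.19: rta nan, lost 100%",
    "[1253236186] HOST ALERT: gadgets02;DOWN;SOFT;1;CRITICAL - 10.10.10.20: rta nan, lost 100%",
]

for when, host, state, state_type in host_alerts(SAMPLE, {"gadgets01", "gadgets02"}):
    print(when.isoformat(), host, state, state_type)
```

Timestamps are converted to UTC here; adjust the timezone if the server
logs should be read in local time.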

PS. I had modified the hosts.cfg generation section of nagconfgen.pl,
changing check_interval from 0 to 5. The original line was:

    check_interval 0 ; For the moment, set check_interval to 0 so hosts only checked on demand, like Nagios 2

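
For reference, the parent rule that normally decides between DOWN and
UNREACHABLE can be sketched roughly as below. This is a simplified
illustration of the behaviour described in the Nagios documentation, not
the actual Nagios or Opsview code, and classify_failed_host is a
hypothetical name used only here.

```python
def classify_failed_host(parent_states):
    """Return the expected state for a host whose own check has just failed.

    parent_states: last known states of the host's parents, each one of
    "UP", "DOWN" or "UNREACHABLE".
    """
    if not parent_states:
        # No parents configured: nothing can block the path, so the host is DOWN.
        return "DOWN"
    if any(state == "UP" for state in parent_states):
        # At least one parent is still reachable, so the fault is the host itself.
        return "DOWN"
    # Every parent is DOWN or UNREACHABLE, so the host is blocked.
    return "UNREACHABLE"


print(classify_failed_host(["UP", "DOWN"]))           # DOWN
print(classify_failed_host(["DOWN", "UNREACHABLE"]))  # UNREACHABLE
```

Under this rule, two hosts with the same parents may still end up in
different states if the recorded parent states differ at the moment each
host's check result is processed.
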
2009/9/17 Ton Voon <[email protected]>
>
> On 17 Sep 2009, at 11:21, Kang wrote:
>
>> In Opsview 3.3.1, I have two hosts configured with the same parent.
>>
>> When both hosts go down, their host states differ:
>> one is DOWN, the other is UNREACHABLE.
>>
>> Why does this happen?
>>
>>
>
> That sounds strange. Can you provide relevant nagios.log entries around
> this time?
>
> I'm guessing that it could be a very deep nagios host logic problem (I note
> there are two parents for each of these hosts), but I'd need to know the
> recreation steps.
>
> Ton
>
> _______________________________________________
> Opsview-users mailing list
> [email protected]
> http://lists.opsview.org/lists/listinfo/opsview-users
>