Dear Michael!
Here is the part of my Icinga log related to an example host and service:
[1480225286] HOST ALERT: myhost;DOWN;SOFT;1;(Host Check Timed Out)
[1480225946] HOST ALERT: myhost;DOWN;SOFT;2;(Host Check Timed Out)
[1480226607] HOST ALERT: myhost;DOWN;HARD;3;(Host Check Timed Out)
[1480226607] HOST NOTIFICATION: netmonitor;myhost;DOWN;host-notify-by-email;(Host Check Timed Out)
[1480226608] HOST NOTIFICATION: mobil-net;myhost;DOWN;host-notify-by-epager-snpp;(Host Check Timed Out)
[1480403736] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;SOFT;1;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480404446] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;SOFT;2;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480405156] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;HARD;3;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480405156] SERVICE NOTIFICATION: netmonitor;voip.myhost;VoIP E1 2;WARNING;notify-by-email;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480405156] SERVICE NOTIFICATION: voip-admin;voip.myhost;VoIP E1 2;WARNING;notify-by-email;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
The gap between consecutive host checks is 660-661 seconds (11 minutes)
instead of the configured 2 minutes, and the gap between consecutive service
checks is 710 seconds (about 12 minutes) instead of 2 minutes (in my
previous mail I gave the wrong value for the service retry_interval). As
far as I know, we are not using any external check addon. Currently 3-4
parallel checks per second are performed, and no performance issues have
been observed.
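For reference, the gaps can be recomputed from the epoch timestamps in the log above. This is only a minimal Python sketch; the helper name `deltas` is my own and has nothing to do with Icinga itself:

```python
# Epoch timestamps copied from the HOST ALERT and SERVICE ALERT lines above.
host_alerts = [1480225286, 1480225946, 1480226607]
service_alerts = [1480403736, 1480404446, 1480405156]


def deltas(timestamps):
    """Return the gaps in seconds between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


host_gaps = deltas(host_alerts)
service_gaps = deltas(service_alerts)

# Print the gaps in seconds and, rounded, in minutes.
for name, gaps in (("host", host_gaps), ("service", service_gaps)):
    print(name, gaps, [round(gap / 60, 1) for gap in gaps])
```

Both gaps are far above the 2 minute retry_interval, for the host as well as for the service.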
Related content of the objects.cache file:
define host {
host_name myhost
alias myhost
address 195.111.100.78
check_command check_ping_1G
contact_groups netadmin+sms
notification_period 24x7
initial_state o
check_interval 4.000000
retry_interval 2.000000
max_check_attempts 3
active_checks_enabled 1
passive_checks_enabled 1
obsess_over_host 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 1
flap_detection_options o,d,u
freshness_threshold 0
check_freshness 0
notification_options d,r
notifications_enabled 1
notification_interval 0.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
failure_prediction_enabled 1
retain_status_information 1
retain_nonstatus_information 1
}
define service {
host_name voip.myhost
service_description VoIP E1 2
check_period 24x7
check_command check_snmp_int_E1!<IP>!---=== VOICE interface2 ===---
contact_groups netadmin_only_critical,voip-admin
notification_period 24x7
initial_state o
check_interval 7.000000
retry_interval 2.000000
max_check_attempts 3
is_volatile 0
parallelize_check 1
active_checks_enabled 1
passive_checks_enabled 1
obsess_over_service 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 1
flap_detection_options o,w,u,c
freshness_threshold 0
check_freshness 0
notification_options u,w,c,r
notifications_enabled 1
notification_interval 0.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
failure_prediction_enabled 1
retain_status_information 1
retain_nonstatus_information 1
}
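To make the mismatch concrete, here is a rough sketch of what I would expect from the values above compared with what the log shows. It assumes that interval_length is left at its default of 60 seconds (that setting is not part of the objects listed here) and that every recheck is scheduled exactly retry_interval after the previous one, ignoring check execution time and timeouts. The function name is only illustrative:

```python
# Values (in interval units) copied from the objects.cache definitions above.
host = {"retry_interval": 2.0, "max_check_attempts": 3}
service = {"retry_interval": 2.0, "max_check_attempts": 3}


def expected_soft_to_hard_seconds(obj, interval_length=60):
    """Seconds from the first SOFT alert to the HARD alert.

    Assumption: interval_length is 60 seconds and each of the
    (max_check_attempts - 1) rechecks runs exactly retry_interval
    after the previous check.
    """
    rechecks = obj["max_check_attempts"] - 1
    return rechecks * obj["retry_interval"] * interval_length


# Observed values: HARD alert timestamp minus first SOFT alert timestamp.
observed_host = 1480226607 - 1480225286
observed_service = 1480405156 - 1480403736

print("host: expected", expected_soft_to_hard_seconds(host), "observed", observed_host)
print("service: expected", expected_soft_to_hard_seconds(service), "observed", observed_service)
```

Under these assumptions I would expect about 240 seconds in both cases, while the log shows 1321 seconds for the host and 1420 seconds for the service.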
Thank you for your answer!
Best regards,
Gergo
2016-11-27 12:26 GMT+01:00 Michael Friedrich <[email protected]>:
>
> > On 26 Nov 2016, at 18:10, Szabó Gergő <[email protected]> wrote:
> >
> > Hi all!
> >
> > I have a problem with the Icinga scheduler. Once a host becomes
> > unreachable, the scheduler does not switch to retry_interval for the
> > host checks but keeps using check_interval. Because max_check_attempts
> > is set to 3, the system changes from soft to hard state after 3
> > attempts and sends a notification by e-mail. As far as I know, the
> > usual behaviour (with the config listed below) should be that after the
> > first check failure the system switches from check_interval (4 minutes)
> > to retry_interval (2 minutes) and rechecks the host's reachability
> > max_check_attempts - 1 (2) times. If the host is still unreachable
> > (after 8 minutes), Icinga sends a notification about it. In reality, it
> > sends the notification after 12 minutes.
> > For service checks there is also a scheduling problem: the template
> > configures 7 minutes as the normal check_interval and 3 minutes as the
> > retry_interval and, as in the host config, max_check_attempts is set
> > to 3, so it should notify after 13 minutes. In reality, the scheduler
> > performs the checks at a fixed interval of ~5 minutes, so I get the
> > notification after 15 minutes.
> > The number of parallel service checks is calculated by the system and,
> > as far as I checked, the value is correct. I have tried to set it
> > manually, but with no success.
>
> It’s hard to follow a text blob when a timeline helps more. Do you have
> some logs which help prove the point? In addition to that, please attach the
> compiled objects from your objects.cache file. Your text differs from the
> templates attached.
>
> Are you using any sort of external check addon such as mod_gearman?
> How’s your overall check latency/execution time? Any performance issues you
> notice on that host (I/O, load, etc.)?
>
> Kind regards,
> Michael
>
>
>
>
> >
> > I also have a running Icinga v2 instance with similar values, and its
> > scheduling works perfectly (the v2 instance will replace the old v1
> > system; for now it is running only in test mode).
> > My main problem is that I feel as if I don't have any influence over
> > how hosts and services are checked.
> >
> > Do you have any idea why the scheduler works as it does?
> >
> > --------------------------------
> >
> > Here are my Icinga templates; all hosts and services use them:
> >
> > define host {
> > name general_host
> > notifications_enabled 1
> > event_handler_enabled 1
> > flap_detection_enabled 1
> > process_perf_data 1
> > retain_status_information 1
> > retain_nonstatus_information 1
> > check_interval 4
> > retry_interval 2
> > max_check_attempts 3
> > notification_interval 0
> > notification_period 24x7
> > notification_options d,r
> > register 0
> > }
> >
> > define service {
> > name general_service
> > is_volatile 0
> > check_interval 7
> > retry_interval 3
> > max_check_attempts 3
> > active_checks_enabled 1
> > passive_checks_enabled 0
> > check_period 24x7
> > parallelize_check 1
> > obsess_over_service 0
> > check_freshness 0
> > event_handler_enabled 1
> > flap_detection_enabled 1
> > process_perf_data 1
> > retain_status_information 1
> > retain_nonstatus_information 1
> > notification_interval 0
> > notification_period 24x7
> > notification_options w,u,c,r
> > notifications_enabled 1
> > contact_groups admin
> > register 0
> > }
> >
> >
> > Thank you in advance for any answers! :)
> > _______________________________________________
> > icinga-users mailing list
> > [email protected]
> > https://lists.icinga.org/mailman/listinfo/icinga-users
>
>
> --
> Michael Friedrich, DI (FH)
> Senior Developer
>
> NETWAYS GmbH | Deutschherrnstr. 15-19 | D-90429 Nuernberg
> Tel: +49 911 92885-0 | Fax: +49 911 92885-77
> CEO: Julian Hein, Bernd Erk | AG Nuernberg HRB18461
> http://www.netways.de | [email protected]
>
> ** OSMC 2016 - November - netways.de/osmc **
> ** OSDC 2017 - Mai – osdc.de **