Re: [icinga-users] Icinga v1 scheduler problem

Szabó Gergő Tue, 29 Nov 2016 04:14:41 -0800

Dear Michael!

Here is a part of my Icinga log for an example host and service:


[1480225286] HOST ALERT: myhost;DOWN;SOFT;1;(Host Check Timed Out)
[1480225946] HOST ALERT: myhost;DOWN;SOFT;2;(Host Check Timed Out)
[1480226607] HOST ALERT: myhost;DOWN;HARD;3;(Host Check Timed Out)
[1480226607] HOST NOTIFICATION: netmonitor;myhost;DOWN;host-notify-by-email;(Host Check Timed Out)
[1480226608] HOST NOTIFICATION: mobil-net;myhost;DOWN;host-notify-by-epager-snpp;(Host Check Timed Out)

[1480403736] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;SOFT;1;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480404446] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;SOFT;2;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480405156] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;HARD;3;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480405156] SERVICE NOTIFICATION: netmonitor;voip.myhost;VoIP E1 2;WARNING;notify-by-email;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING
[1480405156] SERVICE NOTIFICATION: voip-admin;voip.myhost;VoIP E1 2;WARNING;notify-by-email;E1 3/1:---=== VOICE interface2 ===---:UPWARN : ES=3WARN : PCV=6WARN : LES=3WARN : LCV=3:(1 UP): WARNING

The gap between host checks is 661 seconds (11 minutes) instead of 2
minutes, and the gap between service checks is 710 seconds (~12 minutes)
instead of 2 minutes (in my previous mail I gave the wrong value for the
service retry_interval). As far as I know, we are not using any external
check addon. Currently 3-4 parallel checks per second are performed, and
no performance issues have been observed.
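
The gaps can be recomputed from the epoch timestamps in the log excerpt above. The following is only a minimal sketch, not an Icinga tool: the function name alert_gaps is made up for illustration, and the plugin output of the service lines is shortened to keep the sample readable.

```python
import re

# ALERT lines copied from the log excerpt above.
# The service plugin output is shortened here (assumption: only the
# timestamp and the alert type matter for this calculation).
LOG = """\
[1480225286] HOST ALERT: myhost;DOWN;SOFT;1;(Host Check Timed Out)
[1480225946] HOST ALERT: myhost;DOWN;SOFT;2;(Host Check Timed Out)
[1480226607] HOST ALERT: myhost;DOWN;HARD;3;(Host Check Timed Out)
[1480403736] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;SOFT;1;E1 3/1
[1480404446] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;SOFT;2;E1 3/1
[1480405156] SERVICE ALERT: voip.myhost;VoIP E1 2;WARNING;HARD;3;E1 3/1
"""

# Matches "[<epoch>] HOST ALERT: " or "[<epoch>] SERVICE ALERT: ".
ALERT_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) ALERT: ")


def alert_gaps(log_text):
    """Return {alert type: [seconds between consecutive alerts]}."""
    stamps = {}
    for line in log_text.splitlines():
        match = ALERT_RE.match(line)
        if match:
            stamps.setdefault(match.group(2), []).append(int(match.group(1)))
    return {
        kind: [later - earlier for earlier, later in zip(ts, ts[1:])]
        for kind, ts in stamps.items()
    }


for kind, gaps in alert_gaps(LOG).items():
    print(kind, gaps)
# HOST [660, 661]
# SERVICE [710, 710]
```

Both values are far above the 120 seconds that a retry_interval of 2 would suggest.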

Relevant content of the objects.cache file:

define host {
    host_name    myhost
    alias    myhost
    address    195.111.100.78
    check_command    check_ping_1G
    contact_groups    netadmin+sms
    notification_period    24x7
    initial_state    o
    check_interval    4.000000
    retry_interval    2.000000
    max_check_attempts    3
    active_checks_enabled    1
    passive_checks_enabled    1
    obsess_over_host    1
    event_handler_enabled    1
    low_flap_threshold    0.000000
    high_flap_threshold    0.000000
    flap_detection_enabled    1
    flap_detection_options    o,d,u
    freshness_threshold    0
    check_freshness    0
    notification_options    d,r
    notifications_enabled    1
    notification_interval    0.000000
    first_notification_delay    0.000000
    stalking_options    n
    process_perf_data    1
    failure_prediction_enabled    1
    retain_status_information    1
    retain_nonstatus_information    1
    }

define service {
    host_name    voip.myhost
    service_description    VoIP E1 2
    check_period    24x7
    check_command    check_snmp_int_E1!<IP>!---=== VOICE interface2 ===---
    contact_groups    netadmin_only_critical,voip-admin
    notification_period    24x7
    initial_state    o
    check_interval    7.000000
    retry_interval    2.000000
    max_check_attempts    3
    is_volatile    0
    parallelize_check    1
    active_checks_enabled    1
    passive_checks_enabled    1
    obsess_over_service    1
    event_handler_enabled    1
    low_flap_threshold    0.000000
    high_flap_threshold    0.000000
    flap_detection_enabled    1
    flap_detection_options    o,w,u,c
    freshness_threshold    0
    check_freshness    0
    notification_options    u,w,c,r
    notifications_enabled    1
    notification_interval    0.000000
    first_notification_delay    0.000000
    stalking_options    n
    process_perf_data    1
    failure_prediction_enabled    1
    retain_status_information    1
    retain_nonstatus_information    1
    }
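
For comparison, the schedule these values should produce can be sketched as follows. This is a rough illustration, not Icinga's scheduler code: the function name expected_check_offsets is made up, and it assumes that interval_length in icinga.cfg is at its default of 60 seconds and that latency and check execution time are negligible.

```python
# Assumption: interval_length in icinga.cfg is left at its default (60 s).
INTERVAL_LENGTH = 60


def expected_check_offsets(check_interval, retry_interval,
                           max_check_attempts,
                           interval_length=INTERVAL_LENGTH):
    """Seconds from the last OK check to each failing check attempt.

    Idealized model: the first failing check runs one check_interval
    after the last OK check, every further attempt one retry_interval
    after the previous one. The last offset is when the HARD state
    (and with it the notification) would be expected.
    """
    offsets = [check_interval * interval_length]
    for _ in range(max_check_attempts - 1):
        offsets.append(offsets[-1] + retry_interval * interval_length)
    return offsets


# Values taken from the objects.cache entries above.
print(expected_check_offsets(4, 2, 3))  # host:    [240, 360, 480]
print(expected_check_offsets(7, 2, 3))  # service: [420, 540, 660]
```

Under these assumptions the host should reach the HARD state about 8 minutes after the last OK check and the service about 11 minutes after it, with 120 seconds between attempts.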

Thank you for your answer!


Best regards,
Gergo

2016-11-27 12:26 GMT+01:00 Michael Friedrich <[email protected]>:

>
> > On 26 Nov 2016, at 18:10, Szabó Gergő <[email protected]> wrote:
> >
> > Hi all!
> >
> > I have a problem with the Icinga scheduler. Once a host becomes
> > unreachable, the scheduler does not switch to retry_interval for the
> > follow-up checks but keeps using check_interval. Because
> > max_check_attempts is set to 3, the host goes from SOFT to HARD state
> > after 3 attempts and a notification is sent by e-mail. As far as I
> > know, the expected behaviour (with the config listed below) is that
> > after the first failed check the system switches from check_interval
> > (4 minutes) to retry_interval (2 minutes) and rechecks the host
> > max_check_attempts - 1 (2) more times. If the host is still
> > unreachable (after 8 minutes), Icinga sends a notification. In
> > practice it sends the notification after 12 minutes.
> > For service checks there is a scheduling problem as well. The template
> > sets check_interval to 7 minutes and retry_interval to 3 minutes and,
> > as in the host config, max_check_attempts to 3, so it should notify
> > after 13 minutes. In reality, the scheduler performs the checks at a
> > fixed interval of ~5 minutes, so I get the notification after 15
> > minutes.
> > The number of parallel service checks is calculated by the system and,
> > as far as I checked, the value is correct. I also tried setting it
> > manually, without success.
>
> It’s hard to follow a text blob when a timeline helps more. Do you have
> some logs which help prove the point? In addition to that, please attach
> the compiled objects from your objects.cache file. Your text differs from
> the templates attached.
>
> Are you using any sort of external check addon such as mod_gearman?
> How’s your overall check latency/execution time? Any performance issues
> you recognise on that host (I/O, load, etc.)?
>
> Kind regards,
> Michael
>
>
>
>
> >
> > I also have a running Icinga v2 instance with similar values, and its
> > scheduling works perfectly (the v2 instance will replace the old v1
> > system; for now it is running only in test mode).
> > My main problem is that I feel as if I have no influence over how
> > hosts and services are checked.
> >
> > Do you have any idea why the scheduler works as it does?
> >
> > --------------------------------
> >
> > Here are my Icinga templates; all hosts and services use them:
> >
> > define host {
> >     name general_host
> >     notifications_enabled 1
> >     event_handler_enabled 1
> >     flap_detection_enabled 1
> >     process_perf_data 1
> >     retain_status_information 1
> >     retain_nonstatus_information 1
> >     check_interval 4
> >     retry_interval 2
> >     max_check_attempts 3
> >     notification_interval 0
> >     notification_period 24x7
> >     notification_options d,r
> >     register 0
> > }
> >
> > define service {
> >     name general_service
> >     is_volatile 0
> >     check_interval 7
> >     retry_interval 3
> >     max_check_attempts 3
> >     active_checks_enabled 1
> >     passive_checks_enabled 0
> >     check_period 24x7
> >     parallelize_check 1
> >     obsess_over_service 0
> >     check_freshness 0
> >     event_handler_enabled 1
> >     flap_detection_enabled 1
> >     process_perf_data 1
> >     retain_status_information 1
> >     retain_nonstatus_information 1
> >     notification_interval 0
> >     notification_period 24x7
> >     notification_options w,u,c,r
> >     notifications_enabled 1
> >     contact_groups admin
> >     register 0
> > }
> >
> >
> > Thank you in advance for any answers! :)
> > _______________________________________________
> > icinga-users mailing list
> > [email protected]
> > https://lists.icinga.org/mailman/listinfo/icinga-users
>
>
> --
> Michael Friedrich, DI (FH)
> Senior Developer
>
> NETWAYS GmbH | Deutschherrnstr. 15-19 | D-90429 Nuernberg
> Tel: +49 911 92885-0 | Fax: +49 911 92885-77
> CEO: Julian Hein, Bernd Erk | AG Nuernberg HRB18461
> http://www.netways.de | [email protected]
>
> ** OSMC 2016 - November - netways.de/osmc **
> ** OSDC 2017 - Mai – osdc.de **
>

