Hi,
> On 19. Sep 2017, at 17:15, Jered Floyd <[email protected]> wrote:
>
>
> Michael,
>
> Excellent intuition! This looks very much like an issue with the
> InfluxdbWriter queue. It looks like Icinga loses the connection and doesn't
> attempt to reconnect, but queues up all the data indefinitely. TLS is
> enabled, and the configuration is below. I'm guessing this is
> https://github.com/Icinga/icinga2/issues/5469 ?
Yep, my question was to find out whether TLS is in use, so I could point you to
that issue. 2.7.1 is going to be released tomorrow, if nothing else happens.
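
Once 2.7.1 shows up in the repository you can simply upgrade in place; on Debian
that is roughly (a sketch, assuming you keep installing from the official packages
you already use):

  apt-get update && apt-get install --only-upgrade icinga2
  systemctl restart icinga2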
Kind regards,
Michael
>
> Regards,
> --Jered
>
>
> From the most recent instance (filtered to InfluxDB-related messages):
>
> [2017-09-18 10:13:11 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 0, rate: 6.03333/s (362/min 1814/5min 5446/15min);
> [2017-09-18 10:18:21 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 0, rate: 6.01667/s (361/min 1813/5min 5443/15min);
> [2017-09-18 10:23:31 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 0, rate: 5.98333/s (359/min 1814/5min 5441/15min);
> [2017-09-18 10:26:58 -0400] warning/InfluxdbWriter: Response timeout of TCP socket from host '127.0.0.1' port '8086'.
> [2017-09-18 10:28:21 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 10, rate: 6.01667/s (361/min 1810/5min 5440/15min);
> [2017-09-18 10:28:31 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 68, rate: 6.01667/s (361/min 1810/5min 5440/15min); empty in 11 seconds
> [2017-09-18 10:28:41 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 132, rate: 6.01667/s (361/min 1810/5min 5440/15min); empty in 20 seconds
> [2017-09-18 10:28:51 -0400] information/WorkQueue: #5 (InfluxdbWriter, influxdb) items: 200, rate: 6.01667/s (361/min 1810/5min 5440/15min); empty in 29 seconds
>
> ... and the queue keeps growing from there. There are no errors noted in the
> InfluxDB logs.
>
>
> /etc/icinga2/features-enabled/influxdb.conf:
>
> /**
> * The InfluxdbWriter type writes check result metrics and
> * performance data to an InfluxDB HTTP API
> */
>
> library "perfdata"
>
> object InfluxdbWriter "influxdb" {
>   host = "127.0.0.1"
>   port = 8086
>   ssl_enable = true
>   database = "icinga2"
>   username = "icinga2"
>   password = "REDACTED"
>
>   enable_send_thresholds = true
>   enable_send_metadata = true
>
>   host_template = {
>     measurement = "$host.check_command$"
>     tags = {
>       hostname = "$host.name$"
>     }
>   }
>   service_template = {
>     measurement = "$service.check_command$"
>     tags = {
>       hostname = "$host.name$"
>       service = "$service.name$"
>     }
>   }
> }
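>
> For reference, a quick way to sanity-check that the HTTPS endpoint itself answers
> is InfluxDB's standard /ping endpoint (-k in case the certificate is self-signed):
>
>   curl -k -i https://127.0.0.1:8086/ping
>
> A "204 No Content" reply means the TLS listener on the InfluxDB side is healthy.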
>
>
>
>
>
> ----- On Sep 19, 2017, at 9:02 AM, Michael Friedrich
> [email protected] wrote:
>
>>> On 19. Sep 2017, at 14:51, Jered Floyd <[email protected]> wrote:
>>>
>>>
>>> Icinga Users,
>>>
>>> I'm running Icinga 2.7.0 on Debian 8.9 (Jessie), using the packages from the
>>> official repository.
>>>
>>> I find that every few weeks Icinga uses up all of the available memory and
>>> sub-processes are killed by the OOM-killer repeatedly. (It balloons from an
>>> RSS of about 32M to 1GB+.)
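>>>
>>> (One crude way I could catch the ballooning in progress would be to log the RSS
>>> periodically, e.g. something like:
>>>
>>>   while true; do date; ps -C icinga2 -o pid,rss,cmd; sleep 60; done >> /tmp/icinga2-rss.log
>>>
>>> and match the timeline against the Icinga log afterwards.)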
>>>
>>> Data:
>>> 1) I haven't yet been able to strongly correlate this with any causative
>>> environmental factors.
>>
>> Are there any work queue metrics available that show an increasing value
>> (ido-mysql, influxdb, etc.)? You can query them via the API /v1/status endpoint,
>> via the “icinga” check, or by looking at the logs.
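>>
>> For example (a rough sketch; substitute your own API user, password and port):
>>
>>   curl -k -s -u root:icinga 'https://localhost:5665/v1/status/InfluxdbWriter'
>>
>> If the work queue item count in that output keeps climbing (the exact field name
>> varies a bit between versions), the writer is backing up.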
>>
>>>
>>> 2) When this occurs, general monitoring continues but statistics are no longer
>>> written via the InfluxdbWriter. Not sure if this is cause or effect.
>>
>> Please share the configuration for InfluxdbWriter, especially whether TLS is
>> enabled.
>>
>> Kind regards,
>> Michael
>>
>>
>>>
>>> 3) It seems to happen quite rapidly, as the final check_memory logged to
>>> InfluxDB shows 1.5 GB free, and a low memory alert is never triggered
>>> within
>>> Icinga.
>>>
>>> 4) There was a time when this problem did not exist (several months ago), but I
>>> cannot pinpoint exactly when it started.
>>>
>>> Any suggestions on how to start debugging this issue? Unfortunately my
>>> gdb-fu
>>> is relatively weak....
>>>
>>> Thanks,
>>> --Jered
_______________________________________________
icinga-users mailing list
[email protected]
https://lists.icinga.org/mailman/listinfo/icinga-users