Re: [3.x]: openshift router and its own metrics

Clayton Coleman Fri, 16 Aug 2019 18:26:01 -0700

On Aug 16, 2019, at 4:55 AM, Daniel Comnea <[email protected]> wrote:




On Thu, Aug 15, 2019 at 7:46 PM Clayton Coleman <[email protected]> wrote:

>
>
> On Aug 15, 2019, at 12:25 PM, Daniel Comnea <[email protected]> wrote:
>
> Hi Clayton,
>
> Certainly some of the metrics should be preserved across reloads, e.g.
> metrics like *haproxy_server_http_responses_total* should be preserved
> across reloads (though to an extent, Prometheus can handle resets correctly
> with its native support).
>
> However, the metric
> *haproxy_server_http_average_response_latency_milliseconds* also appears
> to be accumulating when we wouldn't expect it to. (According to the
> haproxy stats, I think that's a rolling average over the last 1024 calls --
> so it goes up and down, or should.)
>
>
> File a bug with more details, can’t say off the top of my head
> [DC]: Thank you. Do you have a preference or suggestion for where I should
> open it for OKD? I guess BZ is not suitable for OKD, or am I wrong?
>

There should be BZ components for origin

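For anyone following the latency question quoted above, here is a minimal
sketch of the behaviour being described. It assumes the metric is a plain mean
over the last N samples, as the quoted message guesses; HAProxy's actual
computation may differ, and the class name is made up for illustration. The
point is only that a windowed average can fall as well as rise, so a value
that only ever grows is suspicious.

```python
from collections import deque


class RollingAverage:
    """Mean over the most recent `window` samples (hypothetical helper)."""

    def __init__(self, window=1024):
        # A bounded deque drops the oldest sample automatically once full.
        self.samples = deque(maxlen=window)
        self.total = 0.0

    def add(self, value):
        # Subtract the sample that is about to be evicted, if the window is full.
        if len(self.samples) == self.samples.maxlen:
            self.total -= self.samples[0]
        self.samples.append(value)
        self.total += value

    def value(self):
        # Report 0.0 until at least one sample has been recorded.
        if not self.samples:
            return 0.0
        return self.total / len(self.samples)


if __name__ == "__main__":
    avg = RollingAverage(window=4)
    for latency_ms in (100, 100, 100, 100, 10, 10, 10, 10):
        avg.add(latency_ms)
        print(latency_ms, avg.value())
```

With a small window, feeding slow responses followed by fast ones shows the
average dropping again, which is what one would expect to see in the exported
metric if it really is a rolling average.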

> Thoughts?
>
>
> Cheers,
> Dani
>
>
> On Thu, Aug 15, 2019 at 3:59 PM Clayton Coleman <[email protected]>
> wrote:
>
>> Metrics memory use in the router should be proportional to number of
>> services, endpoints, and routes.  I doubt it's leaking there and if it were
>> it'd be really slow since we don't restart the router monitor process
>> ever.  Stats should definitely be preserved across reloads, but will not be
>> preserved across the pod being restarted.
>>
>> On Thu, Aug 15, 2019 at 10:30 AM Dan Mace <[email protected]> wrote:
>>
>>>
>>>
>>> On Thu, Aug 15, 2019 at 10:03 AM Daniel Comnea <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would appreciate it if anyone could confirm that my understanding of
>>>> the way the router haproxy image [1] is built is correct.
>>>> Am I right to assume that the image [1] is built as it's seen,
>>>> without any other layer being added to include [2]?
>>>> Also, am I right to say the haproxy metrics [2] are part of the origin
>>>> package?
>>>>
>>>>
>>>> A bit of background/context:
>>>>
>>>> A while back on OKD 3.7 we had to swap the openshift 3.7.2 router image
>>>> for 3.10 because we were seeing some problems with the reload, and so we
>>>> wanted to take advantage of the native haproxy 1.8 reload feature to stop
>>>> affecting the traffic.
>>>>
>>>> While everything was working okay, we've noticed recently that
>>>> the haproxy stats do slowly increase, and we wonder whether this is an
>>>> accumulation caused (maybe?) by the reloads. I'm aware of a
>>>> change made [3]; however, I suspect that it is not part of the 3.10 image,
>>>> hence my question to double check whether my understanding is correct.
>>>>
>>>>
>>>> Cheers,
>>>> Dani
>>>>
>>>> [1]
>>>> https://github.com/openshift/origin/tree/release-3.10/images/router/haproxy
>>>> [2]
>>>> https://github.com/openshift/origin/tree/release-3.10/pkg/router/metrics
>>>> [3]
>>>> https://github.com/openshift/origin/commit/8f0119bdd9c3b679cdfdf2962143435a95e08eae#diff-58216897083787e1c87c90955aabceff
>>>> _______________________________________________
>>>> dev mailing list
>>>> [email protected]
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>>>>
>>>
>>> I think Clayton (copied) has the history here, but the nature of the
>>> metrics commit you referenced is that many of the exposed metric points
>>> are counters which were being reset across reloads. The patch was (I think)
>>> to enable counter metrics to correctly accumulate across reloads.
>>>
>>> As to how the image itself is built, the pkg directory is part of the
>>> router controller code included with the image. Not sure if that answers
>>> your question.
>>>
>>> --
>>>
>>> Dan Mace
>>>
>>> Principal Software Engineer, OpenShift
>>>
>>> Red Hat
>>>
>>> [email protected]
>>>
>>>
>>>

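As an illustration of the counter behaviour discussed in the thread above
(counters resetting when haproxy reloads, and the exporter accumulating across
those resets), here is a minimal sketch. It is not the code from the referenced
commit; the class and method names are hypothetical. It assumes a reload is
visible only as the raw counter dropping below its previous value, which is
also roughly how Prometheus detects counter resets on the query side.

```python
class MonotonicCounter:
    """Keeps a counter increasing even when the raw source resets to zero."""

    def __init__(self):
        self.offset = 0    # total carried over from before earlier resets
        self.last_raw = 0  # most recent raw value read from the source

    def observe(self, raw):
        # A raw value lower than the previous one is treated as a reset
        # (assumption: the source restarted from zero, e.g. after a reload).
        if raw < self.last_raw:
            self.offset += self.last_raw
        self.last_raw = raw
        return self.offset + raw


if __name__ == "__main__":
    counter = MonotonicCounter()
    # Raw values as they might be scraped; the drop from 25 to 5 is a reload.
    for raw in (10, 25, 5, 12):
        print(raw, counter.observe(raw))
```

This only makes sense for true counters such as response totals. Applying the
same accumulation to a gauge or an average, such as a latency figure, would
make it grow without bound, which may be worth ruling out when filing the bug.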
