On Aug 16, 2019, at 4:55 AM, Daniel Comnea <comnea.d...@gmail.com> wrote:
On Thu, Aug 15, 2019 at 7:46 PM Clayton Coleman <ccole...@redhat.com> wrote:
>
> On Aug 15, 2019, at 12:25 PM, Daniel Comnea <comnea.d...@gmail.com> wrote:
>
> Hi Clayton,
>
> Certainly some of the metrics should be preserved across reloads, e.g.
> metrics like *haproxy_server_http_responses_total* should be preserved
> across reload (though to an extent, Prometheus can handle resets correctly
> with its native support).
>
> However, the metric
> *haproxy_server_http_average_response_latency_milliseconds* appears also
> to be accumulating when we wouldn't expect it to. (According to the
> haproxy stats, I think that's a rolling average over the last 1024 calls --
> so it goes up and down, or should.)
>
> File a bug with more details, can’t say off the top of my head
> [DC]: thank you, do you have a preference/suggestion where i should open
> it for OKD? i guess BZ is not suitable for OKD, or am i wrong?
> There should be BZ components for origin
> Thoughts?
>
> Cheers,
> Dani
>
> On Thu, Aug 15, 2019 at 3:59 PM Clayton Coleman <ccole...@redhat.com>
> wrote:
>
>> Metrics memory use in the router should be proportional to number of
>> services, endpoints, and routes. I doubt it's leaking there and if it were
>> it'd be really slow since we don't restart the router monitor process
>> ever. Stats should definitely be preserved across reloads, but will not be
>> preserved across the pod being restarted.
>>
>> On Thu, Aug 15, 2019 at 10:30 AM Dan Mace <dm...@redhat.com> wrote:
>>
>>> On Thu, Aug 15, 2019 at 10:03 AM Daniel Comnea <comnea.d...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Would appreciate if anyone can please confirm that my understanding is
>>>> correct w.r.t the way the router haproxy image [1] is built.
>>>> Am i right to assume that the image [1] is built as it's seen
>>>> without any other layer being added to include [2]?
>>>> Also am i right to say the haproxy metrics [2] is part of the origin
>>>> package?
>>>>
>>>> A bit of background/context:
>>>>
>>>> a while back on OKD 3.7 we had to swap the openshift 3.7.2 router image
>>>> with 3.10 because we were seeing some problems with the reload and so we
>>>> wanted to take the benefit of the native haproxy 1.8 reload feature to stop
>>>> affecting the traffic.
>>>>
>>>> While everything was nice and working okay we've noticed recently that
>>>> the haproxy stats do slowly increase and we do wonder if this is an
>>>> accumulation or not, caused (maybe?) by the reloads. Now i'm aware of a
>>>> change made [3], however i suspect that is not part of the 3.10 image, hence
>>>> my question to double check if my understanding is wrong or not.
>>>>
>>>> Cheers,
>>>> Dani
>>>>
>>>> [1] https://github.com/openshift/origin/tree/release-3.10/images/router/haproxy
>>>> [2] https://github.com/openshift/origin/tree/release-3.10/pkg/router/metrics
>>>> [3] https://github.com/openshift/origin/commit/8f0119bdd9c3b679cdfdf2962143435a95e08eae#diff-58216897083787e1c87c90955aabceff
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@lists.openshift.redhat.com
>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>>>>
>>>
>>> I think Clayton (copied) has the history here, but the nature of the
>>> metrics commit you referenced is that many of the exposed metrics points
>>> are counters which were being reset across reloads. The patch was (I think)
>>> to enable counter metrics to correctly accumulate across reloads.
>>>
>>> As to how the image itself is built, the pkg directory is part of the
>>> router controller code included with the image. Not sure if that answers
>>> your question.
>>>
>>> --
>>> Dan Mace
>>> Principal Software Engineer, OpenShift
>>> Red Hat
>>> dm...@redhat.com
>>>
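For anyone following the thread, the two behaviours being contrasted (a counter that keeps accumulating, possibly across resets, versus a rolling average over the last 1024 calls that should go up and down) can be sketched roughly as below. This is only an illustration under stated assumptions: the names `increase_with_resets` and `RollingAverage` are hypothetical and are not taken from the router metrics package or from haproxy, and the reset handling simply mirrors the commonly described Prometheus approach of treating any decrease in a counter as a restart from zero.

```python
from collections import deque


def increase_with_resets(samples):
    """Total increase of a counter over a series of scraped samples.

    Assumption: any drop in value is treated as a reset to zero (for
    example after a process restart), so the post-reset value is counted
    in full instead of producing a negative delta.
    """
    total = 0.0
    prev = None
    for value in samples:
        if prev is not None:
            if value >= prev:
                total += value - prev
            else:
                # Counter went backwards: assume it restarted from zero.
                total += value
        prev = value
    return total


class RollingAverage:
    """Average over the most recent `window` observations.

    The default of 1024 follows the figure mentioned in the thread; the
    class itself is a hypothetical stand-in, not haproxy's implementation.
    """

    def __init__(self, window=1024):
        self._values = deque(maxlen=window)

    def observe(self, value):
        # Once the deque is full, the oldest observation is dropped.
        self._values.append(value)

    def value(self):
        if not self._values:
            return 0.0
        return sum(self._values) / len(self._values)


if __name__ == "__main__":
    # A counter scraped five times, with a reset between 40 and 5.
    print(increase_with_resets([10, 25, 40, 5, 20]))

    # A rolling average falls again once slow calls leave the window.
    avg = RollingAverage(window=4)
    for latency_ms in [100, 100, 100, 100, 0, 0, 0, 0]:
        avg.observe(latency_ms)
    print(avg.value())
```

A metric of the second kind that only ever grows would therefore be behaving like a counter rather than like a windowed average, which is the symptom described above.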