Re: [PATCH] MINOR: promex: backend aggregated server check status
On 11/8/21 at 14:31, William Dauchy wrote:
> On Mon, Nov 8, 2021 at 1:52 PM Willy Tarreau wrote:
> > Just to be sure, is this something you want to merge into 2.5 or is it
> > to be queued next ? I'm fine with both, but I prefer to ask as it's not
> > just a one-liner and I don't know the impact on promex.
>
> I know Christopher was possibly thinking about a more advanced
> implementation but I thought it could be ok for a first version.
> Probably a good idea to wait for Christopher's feedback anyway.
>
> I was indeed targeting a late v2.5, despite being a new thing. Sorry
> about that; I forgot to add a message about it.
>
> If it works well enough, I will also probably ask for a backport to
> 2.4 before the end of the year, as I know large clusters are waiting
> for this patch to lower their memory consumption in Prometheus.
>
> Thanks,

Hi William,

The change seems reasonable for a first implementation. I will add the
backport info to the commit message and merge it.

Thanks!
--
Christopher Faulet
Re: [PATCH] MINOR: promex: backend aggregated server check status
On Mon, Nov 08, 2021 at 02:31:32PM +0100, William Dauchy wrote:
> On Mon, Nov 8, 2021 at 1:52 PM Willy Tarreau wrote:
> > Just to be sure, is this something you want to merge into 2.5 or is it
> > to be queued next ? I'm fine with both, but I prefer to ask as it's not
> > just a one-liner and I don't know the impact on promex.
>
> I know Christopher was possibly thinking about a more advanced
> implementation but I thought it could be ok for a first version.
> Probably a good idea to wait for Christopher's feedback anyway.
>
> I was indeed targeting a late v2.5, despite being a new thing. Sorry
> about that; I forgot to add a message about it.
>
> If it works well enough, I will also probably ask for a backport to
> 2.4 before the end of the year, as I know large clusters are waiting
> for this patch to lower their memory consumption in Prometheus.

OK, thanks for the details. Then we'll just add a hint about the
possibility of a backport to the commit message, as keeping track of the
original author's intent is often useful.

Thanks,
Willy
Re: [PATCH] MINOR: promex: backend aggregated server check status
On Mon, Nov 8, 2021 at 1:52 PM Willy Tarreau wrote:
> Just to be sure, is this something you want to merge into 2.5 or is it
> to be queued next ? I'm fine with both, but I prefer to ask as it's not
> just a one-liner and I don't know the impact on promex.

I know Christopher was possibly thinking about a more advanced
implementation but I thought it could be ok for a first version.
Probably a good idea to wait for Christopher's feedback anyway.

I was indeed targeting a late v2.5, despite being a new thing. Sorry
about that; I forgot to add a message about it.

If it works well enough, I will also probably ask for a backport to
2.4 before the end of the year, as I know large clusters are waiting
for this patch to lower their memory consumption in Prometheus.

Thanks,
--
William
Re: [PATCH] MINOR: promex: backend aggregated server check status
Hi William,

On Sun, Nov 07, 2021 at 10:18:47AM +0100, William Dauchy wrote:
> - add new metric: `haproxy_backend_agg_server_check_status`
>   it counts the number of servers matching a specific check status.
>   This makes it possible to exclude the per-server check status, as
>   users often rely only on the total. Indeed, in large setups with
>   thousands of servers per backend, the memory impact of storing the
>   per-server metric is not negligible.
> - realign promex_str_metrics array
>
> quite simple implementation - we could improve it later by adding an
> internal state to the prometheus exporter, thus avoiding counting at
> every dump.
>
> this patch is an attempt to close github issue #1312

Just to be sure, is this something you want to merge into 2.5 or is it
to be queued next ? I'm fine with both, but I prefer to ask as it's not
just a one-liner and I don't know the impact on promex.

Willy
[PATCH] MINOR: promex: backend aggregated server check status
- add new metric: `haproxy_backend_agg_server_check_status`
  it counts the number of servers matching a specific check status.
  This makes it possible to exclude the per-server check status, as
  users often rely only on the total. Indeed, in large setups with
  thousands of servers per backend, the memory impact of storing the
  per-server metric is not negligible.
- realign promex_str_metrics array

quite simple implementation - we could improve it later by adding an
internal state to the prometheus exporter, thus avoiding counting at
every dump.

this patch is an attempt to close github issue #1312

Signed-off-by: William Dauchy
---
 addons/promex/service-prometheus.c | 229 -
 include/haproxy/stats-t.h          |   1 +
 src/stats.c                        |   4 +
 3 files changed, 131 insertions(+), 103 deletions(-)

diff --git a/addons/promex/service-prometheus.c b/addons/promex/service-prometheus.c
index 221e40705..ef217007d 100644
--- a/addons/promex/service-prometheus.c
+++ b/addons/promex/service-prometheus.c
@@ -189,106 +189,107 @@ const struct promex_metric promex_global_metrics[INF_TOTAL_FIELDS] = {

 /* frontend/backend/server fields */
 const struct promex_metric promex_st_metrics[ST_F_TOTAL_FIELDS] = {
-	//[ST_F_PXNAME]         ignored
-	//[ST_F_SVNAME]         ignored
-	[ST_F_QCUR]   = { .n = IST("current_queue"),              .type = PROMEX_MT_GAUGE,   .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_QMAX]   = { .n = IST("max_queue"),                  .type = PROMEX_MT_GAUGE,   .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_SCUR]   = { .n = IST("current_sessions"),           .type = PROMEX_MT_GAUGE,   .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_SMAX]   = { .n = IST("max_sessions"),               .type = PROMEX_MT_GAUGE,   .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_SLIM]   = { .n = IST("limit_sessions"),             .type = PROMEX_MT_GAUGE,   .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_STOT]   = { .n = IST("sessions_total"),             .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_BIN]    = { .n = IST("bytes_in_total"),             .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_BOUT]   = { .n = IST("bytes_out_total"),            .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_DREQ]   = { .n = IST("requests_denied_total"),      .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC                        ) },
-	[ST_F_DRESP]  = { .n = IST("responses_denied_total"),     .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_EREQ]   = { .n = IST("request_errors_total"),       .type = PROMEX_MT_COUNTER, .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC                                               ) },
-	[ST_F_ECON]   = { .n = IST("connection_errors_total"),    .type = PROMEX_MT_COUNTER, .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_ERESP]  = { .n = IST("response_errors_total"),      .type = PROMEX_MT_COUNTER, .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_WRETR]  = { .n = IST("retry_warnings_total"),       .type = PROMEX_MT_COUNTER, .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_WREDIS] = { .n = IST("redispatch_warnings_total"),  .type = PROMEX_MT_COUNTER, .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_STATUS] = { .n = IST("status"),                     .type = PROMEX_MT_GAUGE,   .flags = (PROMEX_FL_FRONT_METRIC | PROMEX_FL_LI_METRIC | PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },
-	[ST_F_WEIGHT] = { .n = IST("weight"),                     .type = PROMEX_MT_GAUGE,   .flags = (                                               PROMEX_FL_BACK_METRIC | PROMEX_FL_SRV_METRIC) },