Hi,
Thanks Cedric and sorry for the delay. I have few comments.
Le 01/09/2023 à 08:22, Cedric Paillet a écrit :
The state of servers that were put in maintenance via the runtime API are
reported within the "backend_agg_server_check_status" metric, which
I guess you mean "backend_add_check_status". Because
"backend_agg_server_check_status" is a deprecated metric. It was replaced by
"backend_agg_server_status".
lead to inconsistent sums when compared to the "haproxy_server_check_status"
metric.
Now excluding them from this computation.
---
addons/promex/service-prometheus.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/addons/promex/service-prometheus.c
b/addons/promex/service-prometheus.c
index e02e8c0c7..7b61683dd 100644
--- a/addons/promex/service-prometheus.c
+++ b/addons/promex/service-prometheus.c
@@ -863,9 +863,12 @@ static int promex_dump_back_metrics(struct appctx *appctx,
struct htx *htx)
goto next_px;
sv = px->srv;
while (sv) {
+ if (sv->cur_admin &
SRV_ADMF_MAINT)
+ goto next_sv;
Here we should check for PROMEX_FL_NO_MAINT_SRV promex flag, no ? I mean:
if ((ctx->flags & PROMEX_FL_NO_MAINT_SRV) && (sv->cur_admin & SRV_ADMF_MAINT))
goto next_sv;
And I guess we should also check the healthchecks are enabled for the server. It
is not really an issue because call to get_check_status_result() will exclude
neutral and unknown satuses. But there is no reason to count these servers.
srv_check_status =
sv->check.status;
srv_check_count[srv_check_status] += 1;
- sv = sv->next;
+ next_sv:
+ sv = sv->next;
}
for (; ctx->obj_state < HCHK_STATUS_SIZE;
ctx->obj_state++) {
if
(get_check_status_result(ctx->obj_state) < CHK_RES_FAILED)
Otherwise, I'm ok with the fix.
--
Christopher Faulet