Hi,

Thanks Cedric and sorry for the delay. I have few comments.

Le 01/09/2023 à 08:22, Cedric Paillet a écrit :
The state of servers that were put in maintenance via the runtime API are
reported within the "backend_agg_server_check_status" metric, which
I guess you mean "backend_add_check_status". Because "backend_agg_server_check_status" is a deprecated metric. It was replaced by "backend_agg_server_status".

lead to inconsistent sums when compared to the "haproxy_server_check_status"
metric.

Now excluding them from this computation.
---
  addons/promex/service-prometheus.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/addons/promex/service-prometheus.c 
b/addons/promex/service-prometheus.c
index e02e8c0c7..7b61683dd 100644
--- a/addons/promex/service-prometheus.c
+++ b/addons/promex/service-prometheus.c
@@ -863,9 +863,12 @@ static int promex_dump_back_metrics(struct appctx *appctx, 
struct htx *htx)
                                                goto next_px;
                                        sv = px->srv;
                                        while (sv) {
+                                               if (sv->cur_admin & 
SRV_ADMF_MAINT)
+                                                       goto next_sv;

Here we should check for PROMEX_FL_NO_MAINT_SRV promex flag, no ? I mean:

  if ((ctx->flags & PROMEX_FL_NO_MAINT_SRV) && (sv->cur_admin & SRV_ADMF_MAINT))
      goto next_sv;

And I guess we should also check the healthchecks are enabled for the server. It is not really an issue because call to get_check_status_result() will exclude neutral and unknown satuses. But there is no reason to count these servers.

                                                srv_check_status = 
sv->check.status;
                                                
srv_check_count[srv_check_status] += 1;
-                                               sv = sv->next;
+                                               next_sv:
+                                                       sv = sv->next;
                                        }
                                        for (; ctx->obj_state < HCHK_STATUS_SIZE; 
ctx->obj_state++) {
                                                if 
(get_check_status_result(ctx->obj_state) < CHK_RES_FAILED)

Otherwise, I'm ok with the fix.

--
Christopher Faulet


Reply via email to