subject:"RE\: \[PATCH 1\/9\] MAJOR\: contrib\/prometheus\-exporter\: move health check status to labels"

Re: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-02-01 Thread William Dauchy

On Mon, Feb 1, 2021 at 12:05 PM Pierre Cheynier  wrote:
> In addition to this update, I would add some recommendations about the user
> setup in the README ("how do I prevent my prometheus instance to explode
> when scrapping this endpoint?").
> For server-template users:
> - 
>   params:
> no-maint:
> - empty
>
> Generally speaking, to prevent all server metrics to be saved, except this 
> one:
> - 
>metric_relabel_configs:
>- source_labels: ['__name__']
>   regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
>   action: keep

agreed taken into account in v2

> "Status of last health check, per state value" ?

updated in v2.
-- 
William

RE: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2021-02-01 Thread Pierre Cheynier


De : William Dauchy 
Envoyé : samedi 30 janvier 2021 16:21

> this is a follow up of commit c6464591a365bfcf509b322bdaa4d608c9395d75
> ("MAJOR: contrib/prometheus-exporter: move ftd/bkd/srv states to
> labels"). The main goal being to be better aligned with prometheus use
> cases in terms of queries. More specifically to health checks, Pierre C.
> mentioned the possible quirks he had to put in place in order to make
> use of those metrics through prometheus:
> 
>    by(proxy, check_status) (count_values by(proxy,
>   instance) ("check_status", haproxy_server_check_status))

Indeed, we wanted to aggregate at backend level to produce a view which
represent the "number of servers per health status". Still, this was not ideal
since I'm getting the status code and not the status name.

> I am perfectly aware this introduces a lot more metrics but I don't see
> how we can improve the usability without it. The main issue remains in
> the cardinality of the states which are > 20. Prometheus recommends to
> stay below a cardinality of 10 for a given metric but I consider our
> case very specific, because highly linked to the level of precision
> haproxy exposes.
> 
> Even before this patch I saw several large production setup (a few
> hundreds of MB in output) which are making use of the scope parameter to
> simply ignore the server metrics, so that the scrapping can be faster,
> and memory consumed on client side not too high. So I believe we should
> eventually continue in that direction and offer more granularity of
> filtering of the output. That being said it is already possible to
> filter out the data on prometheus client side.

True, I think this is the right approach to represent such data in Prometheus.
It will create some challenges for people like us, who use server-templates
and create thousands of backends, but we'll have to deal with it.
In addition to this update, I would add some recommendations about the user
setup in the README ("how do I prevent my prometheus instance to explode
when scrapping this endpoint?").
For server-template users:
- 
  params:
no-maint:
- empty

Generally speaking, to prevent all server metrics to be saved, except this one:
- 
   metric_relabel_configs:
   - source_labels: ['__name__']
  regex: 'haproxy_(process_|frontend_|backend_|server_check_status).*'
  action: keep

A long-term alternative (at least for my use-case) would be to provide this 
data at
backend-level, as initially suggested here:
https://www.mail-archive.com/haproxy@formilux.org/msg35369.html

> Signed-off-by: William Dauchy 
> ---

> @@ -319,7 +320,7 @@ const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] 
> = {
>  [ST_F_RATE]   = IST("Current number of sessions per second 
> over last elapsed second."),
>  [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
> second."),
>  [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
> second."),
> -   [ST_F_CHECK_STATUS]   = IST("Status of last health check 
> (HCHK_STATUS_* values)."),
> +   [ST_F_CHECK_STATUS]   = IST("Status of last health check (0/1 
> depending on current `state` label value)."),

"Status of last health check, per state value" ?

--
Pierre

Re: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

RE: [PATCH 1/9] MAJOR: contrib/prometheus-exporter: move health check status to labels

2 matches

Site Navigation

Mail list logo

Footer information