> Hi Pierre,

Hi!

> I addressed this issue based on William's idea. I also proposed to add a
> filter to exclude all servers in maintenance from the export. Let me know if
> you see a better way to do so. For the moment, from the exporter's point of
> view, it is not really hard to do such filtering.

Yes, that's a great addition and should improve things a lot, but I'm still
not sure it will be sufficient (meaning that we could end up dropping all
servers if the endpoint output is still too large, as we used to do with the
old exporter).

BTW, we also did that since we're using server-templates, and the naming in
templates makes the server-name information useless (since we can't modify
the name at runtime). So we previously had a sufficient level of info at
backend level, thanks to native aggregations.

>> [ST_F_CHECK_STATUS]   = IST("untyped"),
>> What could be done to be able to retrieve them? (I thought about something
>> similar to `HRSP_[1-5]XX`, where the different check status could be
>> defined and counted).
>> 
>
> Hum, I can add the check status. Mapping all statuses to integers is
> possible. However, having a metric per status is probably not the right
> solution, because it is not a counter but just a state (a boolean). If we do
> so, all statuses would be set to 0 except the current one. It is not really
> handy. But a mapping is possible. We already do this for the
> frontend/backend/server status (ST_F_STATUS).

Yes, it would work perfectly. In the end, the goal for us is to be able to
retrieve this state. My idea about a counter was more about backend-level
aggregations, if consistent (I'm not sure it actually is, hence the feedback
request).
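
To make the mapping idea concrete, here is a rough sketch in Python of how a
check status could be exposed as a single integer-valued gauge. The metric
name `haproxy_server_check_status`, the label names and the integer values
are all assumptions made up for illustration; they are not what the exporter
does today.

```python
# Hypothetical status-to-integer table: the numbers are illustrative only
# and do not reflect any mapping actually used by the exporter.
CHECK_STATUS_VALUES = {
    "L7OK": 0,
    "L4TOUT": 1,
    "L4CON": 2,
    "L7STS": 3,
}


def render_check_status(proxy: str, server: str, status: str) -> str:
    """Render one sample line in the Prometheus text exposition format."""
    value = CHECK_STATUS_VALUES[status]
    # The metric and label names below are assumptions, not the real ones.
    return f'haproxy_server_check_status{{proxy="{proxy}",server="{server}"}} {value}'


if __name__ == "__main__":
    print(render_check_status("be_app", "srv1", "L7STS"))
```

With such a mapping there is one sample per server, and its value changes
when the state changes, instead of one 0/1 metric per possible status.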

>> * also for `check_status`, there is the case of L7STS and its associated
>> values that are present in another field. Most probably it could benefit
>> from a better representation in a prometheus output (thanks to labels)?
>>
> We can also export the ST_F_CHECK_CODE metric. For the use of labels, I have
> no idea. For now, the labels are static in the exporter. And I don't know if
> it is pertinent to add dynamic info in labels. If so, what is your idea? Add
> a "code" label associated to the check_status metric?

Here again, my maybe-not-so-good idea was to keep the ability to retrieve all
the underlying details at backend level, such as:
* 100 servers are L7OK
* 1 server is L4TOUT
* 2 servers are L4CON
* 2 servers are L7STS
** 1 due to an HTTP 429
** 1 due to an HTTP 503

But this is maybe overkill in terms of complexity; we could instead focus on
our ability to retrieve the status of non-maint servers.
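
For reference, the backend-level breakdown I had in mind could be sketched
roughly as below, in Python. The dictionary keys (`backend`, `status`,
`check_status`, `check_code`), the metric name
`haproxy_backend_check_status_servers` and the `state`/`code` labels are all
hypothetical names chosen for the example, not existing exporter fields.

```python
from collections import Counter


def aggregate_check_status(servers):
    """Count non-maintenance servers per (backend, check status, code)."""
    counts = Counter()
    for srv in servers:
        # Servers in maintenance are excluded from the aggregation.
        if srv["status"] == "MAINT":
            continue
        # In this sketch only L7STS keeps its code as an extra label.
        code = str(srv["check_code"]) if srv["check_status"] == "L7STS" else ""
        counts[(srv["backend"], srv["check_status"], code)] += 1
    return counts


def render(counts):
    """Render the counts as Prometheus text exposition lines."""
    lines = []
    for (backend, state, code), n in sorted(counts.items()):
        labels = f'proxy="{backend}",state="{state}"'
        if code:
            labels += f',code="{code}"'
        # Hypothetical metric name, used only for illustration.
        lines.append(f"haproxy_backend_check_status_servers{{{labels}}} {n}")
    return lines


if __name__ == "__main__":
    sample = [
        {"backend": "be", "status": "UP", "check_status": "L7OK", "check_code": 200},
        {"backend": "be", "status": "DOWN", "check_status": "L7STS", "check_code": 429},
        {"backend": "be", "status": "DOWN", "check_status": "L7STS", "check_code": 503},
        {"backend": "be", "status": "MAINT", "check_status": "L4CON", "check_code": 0},
    ]
    print("\n".join(render(aggregate_check_status(sample))))
```

The number of series stays bounded by the number of distinct
(status, code) pairs per backend rather than by the number of servers.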

> It is feasible. But only counters may be aggregated. It may be enabled using
> a parameter in the query-string. However, it is probably pertinent only when
> the server metrics are filtered out. Because otherwise, Prometheus can handle
> the aggregation itself.

Sure, we should rely on this as much as possible.

--
Pierre
