Hi Pierre,

Sorry, I missed your email. Thanks to William for the reminder.
On 15/11/2019 at 15:55, Pierre Cheynier wrote:
We recently tried to switch to the native Prometheus exporter, but were quickly stopped in our initiative given the output on one of our preprod servers:

    $ wc -l metrics.out
    1478543 metrics.out
    $ ls -lh metrics.out
    -rw-r--r-- 1 pierre pierre 130M nov. 15 15:33 metrics.out

This is not only due to a large setup, but essentially related to server lines, since we extensively use server-templates for server addition/deletion at runtime.

    # backend & servers number
    $ echo "show stat -1 2 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
    1309
    $ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
    36360
    # But a lot of them are actually "waiting to be provisioned" (especially on this preprod environment)
    $ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | grep MAINT | wc -l
    34113

We'll filter out the server metrics as a quick fix, and will hopefully submit something to do it natively, but we would also like to get your feedback about some use cases we expected to solve with this native exporter.
I addressed this issue based on an idea from William. I also proposed to add a filter to exclude all servers in maintenance from the export. Let me know if you see a better way to do so. For the moment, from the exporter's point of view, such filtering is not really hard to do.
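To illustrate the kind of filtering discussed here, below is a minimal sketch that drops servers in maintenance from a "show stat" CSV dump on the client side. It is not the exporter's implementation: the function name `drop_maint_servers` is hypothetical, and the sketch only assumes the usual CSV layout, where the `type` column is `2` for servers and the `status` column starts with `MAINT` for servers in maintenance.

```python
import csv
import io


def drop_maint_servers(show_stat_csv: str) -> list[dict]:
    """Parse a "show stat" CSV dump and skip server rows in maintenance.

    Hypothetical helper for illustration; it only looks at the
    "type" and "status" columns of the dump.
    """
    # The header line of the dump starts with "# ", strip it so that
    # the first column is named "pxname".
    text = show_stat_csv.lstrip("# ")
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        is_server = row.get("type") == "2"  # 2 = server line
        # Status can be "MAINT", "MAINT (via ...)", "MAINT (resolution)"...
        if is_server and (row.get("status") or "").startswith("MAINT"):
            continue
        kept.append(row)
    return kept
```

On a setup like the one quoted above, where most server lines are unprovisioned server-template slots, this kind of filter would leave only the servers that are actually in use.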
Ultimately, one of them would be of great added value for us: being able to count check_status types (and their values in the L7STS case) per backend. So, there are 3 associated points:

* it's great to have new metrics (such as `haproxy_process_current_zlib_memory`), but we also noticed that some very useful ones were not present due to their type, for example:

    [ST_F_CHECK_STATUS] = IST("untyped"),

  What could be done to be able to retrieve them? (I thought about something similar to `HRSP_[1-5]XX`, where the different check statuses could be defined and counted.)
Hmm, I can add the check status. Mapping all statuses to integers is possible. However, having one metric per status is probably not the right solution, because it is not a counter but just a state (a boolean). If we did so, all statuses would be set to 0 except the current one, which is not really handy. But a mapping is possible. We already do this for the frontend/backend/server status (ST_F_STATUS).
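The difference between the two representations can be sketched as follows. This is an illustration only: the integer values, the metric name `haproxy_server_check_status`, and the `proxy`, `server` and `state` labels are assumptions made for the example, not what the exporter emits. The status names are those found in the `check_status` CSV field.

```python
# Check status names as they appear in the "check_status" CSV field.
# The integer values below are an ASSUMPTION made for this sketch only.
CHECK_STATUS_NAMES = [
    "UNK", "INI", "SOCKERR",
    "L4OK", "L4TOUT", "L4CON",
    "L6OK", "L6TOUT", "L6RSP",
    "L7OK", "L7OKC", "L7TOUT", "L7RSP", "L7STS",
]
CHECK_STATUS_VALUES = {name: value for value, name in enumerate(CHECK_STATUS_NAMES)}


def normalize(check_status: str) -> str:
    # The CSV field is prefixed with "* " while a check is in progress.
    return check_status.lstrip("* ").strip()


def mapped_gauge(proxy: str, server: str, check_status: str) -> str:
    """One gauge per server, whose value encodes the current status."""
    value = CHECK_STATUS_VALUES[normalize(check_status)]
    return f'haproxy_server_check_status{{proxy="{proxy}",server="{server}"}} {value}'


def one_gauge_per_status(proxy: str, server: str, check_status: str) -> list[str]:
    """One boolean gauge per status: all 0 except the current one."""
    current = normalize(check_status)
    return [
        f'haproxy_server_check_status{{proxy="{proxy}",server="{server}",'
        f'state="{name}"}} {int(name == current)}'
        for name in CHECK_STATUS_NAMES
    ]
```

With the second form, every server produces one sample per known status, which multiplies the number of server lines again, while the mapped form keeps a single sample per server.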
* also for `check_status`, there is the case of L7STS and its associated values, which are present in another field. Most probably it could benefit from a better representation in the Prometheus output (thanks to labels)?
We can also export the ST_F_CHECK_CODE metric. For the use of labels, I have no idea. For now, the labels are static in the exporter, and I don't know if it is appropriate to add dynamic info in labels. If so, what do you have in mind? Adding a "code" label to the check_status metric?
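As a sketch of what such a "code" label could look like in the text exposition format, the helper below renders one sample with an optional dynamic label. The function name, the metric name `haproxy_server_check_status` and the `proxy`/`server` labels are assumptions for the example; nothing here reflects the exporter's current output.

```python
def check_status_with_code(proxy: str, server: str, value: int, check_code: str = "") -> str:
    """Render one sample, adding a "code" label only when a code is known.

    Hypothetical helper: metric and label names are illustrative.
    """
    labels = {"proxy": proxy, "server": server}
    if check_code:
        # Dynamic label: its value comes from the check_code field.
        labels["code"] = check_code
    rendered = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"haproxy_server_check_status{{{rendered}}} {value}"
```

One thing to keep in mind with this design: in Prometheus, each distinct set of label values is a separate time series, so a server whose check code changes over time would produce a new series for each code.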
* what about getting some backend-level aggregation of server metrics, such as the one that was previously mentioned, to avoid retrieving all the server metrics but still be able to get some insights? I'm thinking about an aggregation of some fields at backend level, which was not previously done with the CSV output.
It is feasible, but only counters may be aggregated. It could be enabled using a parameter in the query string. However, it is probably only relevant when the server metrics are filtered out, because otherwise Prometheus can handle the aggregation itself.
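A backend-level aggregation restricted to counters could be sketched like this, working on rows already parsed from the "show stat" CSV. The helper name and the choice of fields are assumptions for the example (`stot`, `bin` and `bout` are cumulative counters in the CSV output); gauges and states are deliberately left out because summing them is not meaningful.

```python
from collections import defaultdict

# Cumulative counters from the CSV output; the selection is illustrative.
COUNTER_FIELDS = ("stot", "bin", "bout")


def aggregate_server_counters(rows: list[dict]) -> dict[str, dict[str, int]]:
    """Sum server counters per backend (hypothetical helper).

    Only rows with type "2" (servers) are taken into account.
    """
    totals = defaultdict(lambda: dict.fromkeys(COUNTER_FIELDS, 0))
    for row in rows:
        if row["type"] != "2":
            continue
        for field in COUNTER_FIELDS:
            # Empty fields are treated as 0.
            totals[row["pxname"]][field] += int(row[field] or 0)
    return dict(totals)
```

This gives one set of totals per backend instead of one sample per server, which is the case where the server lines themselves are not exported.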
-- Christopher Faulet