[prometheus-users] Federation of aggregated metrics

hannesst...@gmail.com Tue, 31 Oct 2023 09:54:11 -0700

*Is it possible to include an aggregation formula in a federation query?*
This would avoid creating and storing aggregated metrics on the lower-level 
Prometheus, and also remove the delay between aggregation and federation.
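
To make the first question concrete, here is a minimal Python sketch of how I understand a federation request to be built today. As far as I can tell, each match[] parameter of the /federate endpoint takes a series selector rather than a full PromQL expression, which is why the aggregate has to exist as a recorded series first. The host name and the recorded series name below are made up for illustration.

```python
from urllib.parse import urlencode

def federate_url(base_url, selectors):
    """Build a /federate URL with one match[] parameter per series selector."""
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url}/federate?{query}"

# Hypothetical partition-level server and hypothetical pre-aggregated series.
url = federate_url(
    "http://partition-a.example:9090",
    ['{__name__="job:http_requests:rate5m"}'],
)
print(url)
```

What I would like to put in place of the selector is the aggregation expression itself.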

*Or, can a recording rule fetch metrics directly from another Prometheus
server?*
This would avoid the delay between federation and execution of the
recording rule.

We've got hundreds of pods, delivering millions of metrics. We plan to
partition our pods and deploy one Prometheus per partition. A top-level
Prometheus will then offer globally aggregated metrics.

How should we set this up? My current plan is to run aggregating recording
rules within each partition, then aggregate again at the top level to get
the global values. This seems both complicated and wasteful, and it also
introduces delays, since recording rule evaluation and federation scrapes
cannot be synchronized with each other.
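
To illustrate the two-level aggregation I have in mind, here is a small Python sketch that mimics it outside of Prometheus. The partition names, pod names, and values are invented, and a plain sum stands in for whatever aggregation the recording rules would actually perform.

```python
# Hypothetical per-pod values, grouped by the partition that scrapes them.
partitions = {
    "partition-a": {"pod-1": 12.0, "pod-2": 8.0},
    "partition-b": {"pod-3": 5.0, "pod-4": 15.0},
}

def aggregate_partition(pod_values):
    """Stand-in for a recording rule on a partition-level Prometheus."""
    return sum(pod_values.values())

def aggregate_global(partition_totals):
    """Stand-in for a recording rule on the top-level Prometheus."""
    return sum(partition_totals.values())

# Step 1: each partition aggregates its own pods.
partition_totals = {
    name: aggregate_partition(pods) for name, pods in partitions.items()
}

# Step 2: the top level aggregates the federated partition totals.
global_total = aggregate_global(partition_totals)
print(partition_totals, global_total)
```

Sums compose cleanly across the two levels like this. Something like an average would not, unless each partition also exported the matching count.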

To minimize delay, each partition-level Prometheus needs to aggregate as
often as possible to offer "fresh" metrics to federation requests. Then the
top-level Prometheus also needs to federate these aggregated metrics as
often as possible to offer "fresh" values at the global level. Doing things
as often as possible "just in case" seems wasteful, which is why I am
asking if this is the right approach for us.
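
A rough Python sketch of the trade-off I am worried about, under the simplifying assumption that the three stages (partition rule evaluation, federation scrape, top-level rule evaluation) run unsynchronized and that evaluation time itself is negligible. The interval values are only examples, not our real settings.

```python
def worst_case_delay(partition_rule_s, federation_scrape_s, global_rule_s):
    """Upper bound on added staleness when each stage just misses the previous one."""
    return partition_rule_s + federation_scrape_s + global_rule_s

def runs_per_hour(interval_s):
    """How often a stage with the given interval runs in one hour."""
    return 3600 / interval_s

# Example intervals in seconds (assumed values), used for all three stages.
for interval in (60, 15, 5):
    delay = worst_case_delay(interval, interval, interval)
    print(f"interval={interval}s  worst-case delay={delay}s  "
          f"runs/hour per stage={runs_per_hour(interval):.0f}")
```

Shortening every interval reduces the worst-case delay linearly, but the number of rule evaluations and federation scrapes grows by the same factor.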
