*Is it possible to include an aggregation formula in a federation query?* This would avoid creating and storing aggregated metrics on the lower-level Prometheus, and also remove the delay between aggregation and federation.
*Or, can a recording rule fetch metrics directly from another Prometheus server?* This would avoid the delay between federation and execution of the recording rule.

We have hundreds of pods delivering millions of metrics. We plan to partition our pods and deploy one Prometheus per partition. A top-level Prometheus will then offer globally aggregated metrics. How should we set this up?

My current assumption is to run aggregation recording rules within each partition, then again at the top level to get the global aggregation. This seems both complicated and wasteful, and it also introduces delays, since recording rules and federation cannot be synchronized with each other.

To minimize delay, each partition-level Prometheus needs to aggregate as often as possible to offer "fresh" metrics to federation requests. The top-level Prometheus then also needs to federate these aggregated metrics as often as possible to offer "fresh" values at the global level. Doing things as often as possible "just in case" seems wasteful, which is why I am asking whether this is the right approach for us.
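For reference, here is a minimal sketch of the two-level setup I have in mind. The metric name `http_requests_total`, the rule names, the `partition` label, the target hostnames, and the 15s intervals are all placeholders for illustration, not our real configuration.

```yaml
# --- Partition-level Prometheus: prometheus.yml (fragment) ---
global:
  external_labels:
    partition: a            # hypothetical label identifying this partition
rule_files:
  - partition_rules.yml     # hypothetical file name
---
# --- Partition-level Prometheus: partition_rules.yml ---
groups:
  - name: partition_aggregation
    interval: 15s           # how often the aggregation is re-evaluated
    rules:
      # Pre-aggregate across all pods in this partition.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
---
# --- Top-level Prometheus: prometheus.yml (fragment) ---
scrape_configs:
  - job_name: federate
    scrape_interval: 15s    # how often aggregated series are pulled
    honor_labels: true      # keep the labels set by the partition servers
    metrics_path: /federate
    params:
      'match[]':
        - '{__name__=~"job:.*"}'   # only fetch the pre-aggregated series
    static_configs:
      - targets:
          - prometheus-partition-a:9090   # hypothetical hostnames
          - prometheus-partition-b:9090
rule_files:
  - global_rules.yml        # hypothetical file name
---
# --- Top-level Prometheus: global_rules.yml ---
groups:
  - name: global_aggregation
    interval: 15s
    rules:
      # Sum the per-partition aggregates into one global series per job.
      - record: global:http_requests:rate5m
        expr: sum by (job) (job:http_requests:rate5m)
```

Each of the three intervals above (partition rule evaluation, federation scrape, top-level rule evaluation) runs on its own schedule, which is where the accumulated delay I am worried about comes from.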