On 06/05/2020 20:21, Shay Berman wrote:
Hi Stuart

I agree with your points.

About this section:
/"The options which run queries on the "local" Prometheus servers require
those services to be available and not too busy - you can have the
situation that a query from somewhere else breaks a server because it is
too big/too slow. Equally a server being unavailable (down/network
issues) will cause a query to fail."/

You didn't mention promxy or Thanos Query - these could help to avoid failing the whole query if a single Prometheus instance is not responding.


It could help (or hinder) depending on the failure mode & query purpose.

If you are running a query across multiple sharded servers (e.g. different environments), Thanos/promxy isn't going to help with the missing data. However, if you have HA pairs of servers everywhere, it can be very useful when a single server has issues.
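
To make the sharded vs. HA distinction concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not how Thanos or promxy are implemented: the topology layout, the server names and the fan_out function are all made up for illustration. A shard counts as answered if at least one of its replicas is reachable.

```python
from typing import Dict, List, Tuple

# Hypothetical topology: shard name -> {replica name: is the replica reachable?}
Topology = Dict[str, Dict[str, bool]]


def fan_out(topology: Topology) -> Tuple[List[str], List[str]]:
    """Return (shards that answered, shards with no data) for one fan-out query.

    A shard can answer as long as any one of its replicas is reachable.
    """
    answered: List[str] = []
    missing: List[str] = []
    for shard, replicas in topology.items():
        if any(replicas.values()):
            answered.append(shard)
        else:
            missing.append(shard)
    return answered, missing


# HA pairs everywhere: losing one replica of "prod" loses no data.
ha_pairs: Topology = {
    "prod": {"prom-a": False, "prom-b": True},
    "staging": {"prom-a": True, "prom-b": True},
}

# One server per shard: losing the "prod" server means its data is simply absent.
single_servers: Topology = {
    "prod": {"prom-a": False},
    "staging": {"prom-a": True},
}

print(fan_out(ha_pairs))
print(fan_out(single_servers))
```

In the second case the query layer can at best return a partial result; whether a partial result is acceptable depends on what the query is for.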

If you have queries which stress a server (either due to the number of time series covered or just overall query volume), systems which duplicate queries could in certain situations make things worse - maybe every server is now overloaded.
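
A rough back-of-the-envelope version of that load argument, again in Python. The capacity figures, server names and the overloaded_servers function are assumptions for illustration only; real load depends heavily on the queries themselves. The model compares sending every query to every server against spreading queries evenly across servers.

```python
from typing import Dict, List


def overloaded_servers(
    capacity_qps: Dict[str, float], query_qps: float, duplicate: bool
) -> List[str]:
    """Return the names of servers whose load exceeds their capacity.

    duplicate=True  -> every server receives every query.
    duplicate=False -> queries are spread evenly across the servers.
    """
    n = len(capacity_qps)
    per_server = query_qps if duplicate else query_qps / n
    return sorted(name for name, cap in capacity_qps.items() if per_server > cap)


# Hypothetical HA pair, each server able to sustain 80 queries per second.
capacity = {"prom-a": 80.0, "prom-b": 80.0}

print(overloaded_servers(capacity, query_qps=100.0, duplicate=True))
print(overloaded_servers(capacity, query_qps=100.0, duplicate=False))
```

With these made-up numbers, duplicating the queries pushes both servers past capacity, while spreading the same volume leaves each at 50 queries per second.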

As I say, the exact "best option" very much depends on your particular situation. Is it a single environment in one location, or lots of environments globally? Do you have a single easily defined set of users (dashboards/alerts) or lots of different teams with different needs & requirements (e.g. some needing longer-term querying for capacity management, while others only need short-term querying for incident management)? Does the way you operate fit into a more hierarchical structure/process (e.g. region -> environment -> service -> instance) or are things more "flat"?

--
Stuart Clark
