On 09/06/2021 07:16, nina guo wrote:
Thank you very much.
May I ask if there is a way to make multiple Prometheus instances
scrape different targets?
Comparing the two solutions, scraping the same targets vs. scraping
different targets, which is better?
On Monday, June 7, 2021 at 6:08:47 PM UTC+8 Stuart Clark wrote:
When doing autoscaling (not specifically with Prometheus but
everything) you need to ensure that you don't have too many
changes happening at once, otherwise you might start rejecting
requests (if all instances are restarting at the same time).
This would generally be done via things like pod disruption
budgets. For a pair of Prometheus servers I'd not want more than
one change at once. For other systems I might go as far as N-1
changes at once.
Yes, sharding is a standard solution when you want to scale Prometheus
performance.
The two options are for different use cases and work together. A single
Prometheus server can handle a certain number of targets/queries based
on both the number of metrics being scraped and the CPU/memory assigned
to that server. Above that level you would look to split your list of
targets across multiple servers. It might also make sense to split for
organisational reasons: different servers split by product, service,
location, etc., which are managed by different teams, for example. So
you might have one server in location X and two servers in location Y
(product A and product B).
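
As a rough illustration of splitting a target list across several servers, here is a minimal Python sketch that assigns each target to a shard using a stable hash. In Prometheus itself this kind of split is commonly configured with the hashmod relabelling action; the code below only illustrates the idea and is not Prometheus's own implementation. The function names and the target addresses are hypothetical.

```python
import hashlib


def shard_for(target: str, num_shards: int) -> int:
    """Map a target address to a shard index using a stable hash.

    MD5 is used only because it is stable across runs and machines;
    this is an illustration, not Prometheus's exact algorithm.
    """
    digest = hashlib.md5(target.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % num_shards


def split_targets(targets, num_shards):
    """Group targets by shard so each server scrapes only its own group."""
    shards = {i: [] for i in range(num_shards)}
    for t in targets:
        shards[shard_for(t, num_shards)].append(t)
    return shards


# Hypothetical node-exporter style addresses, for illustration only.
targets = [f"10.0.0.{i}:9100" for i in range(1, 21)]
shards = split_targets(targets, 3)
for idx, members in shards.items():
    print(idx, len(members), members)
```

Each target always lands on the same shard for a given shard count, so every server can compute its own share independently. Changing the number of shards moves most targets, which is worth planning for.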
You might also have additional, more central servers for global alerts,
using federation or a system such as Thanos. The aim is not to combine
all metrics together (which would be a single point of failure and
require massive resources) but to allow for a consolidated view.
Alongside this you would use pairs of Prometheus servers for HA, so that
if a single server isn't operating (failure, maintenance, etc.) you don't
lose metrics. You might run a system such as promxy or Thanos in front
of each pair to handle deduplication. So in the example of 3 groups of
Prometheus servers (X, AY & BY) they would actually be HA pairs, so 6
servers in total. If using a system such as Kubernetes you'd need to
limit how many changes happen at once (e.g. via pod disruption budgets)
so that the second pod isn't stopped/replaced while the first is out of
action.
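
To show why a deduplicating layer in front of an HA pair avoids gaps, here is a small Python sketch that merges one series as seen by two replicas. It is only an assumption-laden illustration of the idea: it is not how promxy or Thanos implement deduplication, and the function name, the 15 second interval and the sample data are all made up.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


def merge_replicas(a: List[Sample], b: List[Sample], interval: int = 15) -> List[Sample]:
    """Merge one series from two replicas, one sample per scrape interval.

    Samples are bucketed by scrape interval. Replica A is written last,
    so it wins whenever both replicas have a sample in the same bucket;
    replica B only fills the buckets that A is missing.
    """
    merged: Dict[int, Sample] = {}
    for ts, value in b + a:
        merged[ts // interval] = (ts, value)
    return [merged[slot] for slot in sorted(merged)]


# Replica A missed two scrapes (e.g. it was restarting); replica B did not.
replica_a = [(0, 1.0), (15, 2.0), (60, 5.0)]
replica_b = [(1, 1.0), (16, 2.0), (31, 3.0), (46, 4.0), (61, 5.0)]
result = merge_replicas(replica_a, replica_b)
print(result)
```

The merged series has one sample per interval even though neither replica was required to be up the whole time, which is the point of running the pair.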
--
Stuart Clark