[prometheus-users] Scaling Prometheus

2020-10-10 Thread kvr
Hello, we are hitting some limits with our current Prometheus setup. I have read a lot of posts here, as well as blogs and videos, but still need some guidance. Our current setup is at its limit. The head series count regularly reaches around 15M during pod churn. Each app exports between 5000 and 80

Re: [prometheus-users] snmp_exporter and lldp

2020-10-10 Thread Ben Kochie
It seems to work for me. Have you checked for parse errors? https://github.com/SuperQ/tools/tree/master/snmp_exporter/lldp On Fri, Oct 9, 2020 at 2:42 PM Gilbert Moisio wrote: > Hi all, trying to generate an *snmp.yml* file walking from *generator.yml* > into *lldpRemTable* of *LLDP-MIB*, th

Re: [prometheus-users] Prometheus HA Setup.

2020-10-10 Thread Ben Kochie
4.6TB for 50 days seems like a lot. How many metrics and how many samples per second are you collecting? Just estimating based on the data, it sounds like you might have more than 10 million series and 600-700k samples per second. This might be the time to start thinking about sharding. You can che
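
A rough way to reproduce this kind of estimate is the sizing rule from the Prometheus storage documentation: disk space is approximately retention time in seconds, times ingested samples per second, times bytes per sample. The sketch below rearranges that rule to solve for the ingestion rate. The function name is hypothetical, the 1 to 2 bytes per sample range is the documented average and not a measurement from this setup, and 4.6TB is assumed to mean 4.6e12 bytes.

```python
def samples_per_second(disk_bytes, retention_days, bytes_per_sample):
    """Estimate ingestion rate from on-disk size and retention.

    Rearranges: disk = retention_seconds * samples_per_second * bytes_per_sample
    """
    retention_seconds = retention_days * 24 * 60 * 60
    return disk_bytes / (retention_seconds * bytes_per_sample)


# Assumption: 4.6TB is taken as 4.6e12 bytes (decimal terabytes).
disk_bytes = 4.6e12
retention_days = 50

# Assumption: 1-2 bytes per sample, the average cited in the Prometheus docs.
for bytes_per_sample in (1.0, 1.5, 2.0):
    rate = samples_per_second(disk_bytes, retention_days, bytes_per_sample)
    print(f"{bytes_per_sample:.1f} bytes/sample -> about {rate:,.0f} samples/s")
```

Under these assumptions the rate comes out in the hundreds of thousands of samples per second. The actual figure depends on compression, WAL size, and churn, so it is better read from the server's own ingestion metrics than from disk usage.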

Re: [prometheus-users] Re: Prometheus HA Setup.

2020-10-10 Thread Stuart Clark
On 10/10/2020 08:43, yagyans...@gmail.com wrote: d) If we do use 2 separate disks for the 2 instances, how will we manage the config files? I mean, is there any way to make the changes on one instance and have them replicated to the other instance automatically, or will we have to do that manuall

[prometheus-users] Re: Prometheus HA Setup.

2020-10-10 Thread yagyans...@gmail.com
d) If we do use 2 separate disks for the 2 instances, how will we manage the config files? I mean, is there any way to make the changes on one instance and have them replicated to the other instance automatically, or will we have to do that manually? On Saturday, October 10, 2020 at 12:36:25 PM U

[prometheus-users] Inhibit not work when two related alert first arrived at same time

2020-10-10 Thread Allenzh li
Hi, I use Cortex, which uses Prometheus and Alertmanager as infrastructure, in my environment. When I test inhibition, I find that it sometimes does not work as expected. In one case, I run Cortex and set two rules: 1. go_threads > 0 severity 4 2. go_threads > 10 severity 1 My inhibit rules: - source_mat

[prometheus-users] Prometheus HA Setup.

2020-10-10 Thread yagyans...@gmail.com
Hi. I have a vanilla Prometheus setup with 50 days of retention and a data size of around 4.6TB for that retention. I want to move to an HA setup to avoid a single point of failure. I'm a little confused about how to approach the points below: a) With an HA pair, does the Prometheus data necessarily b