Re: [prometheus-users] Large cluster simulation

2021-09-23 Thread sayf.eddi...@gmail.com
Thanks, I'll take a look. On Thursday, September 23, 2021 at 9:47:08 AM UTC+2 sup...@gmail.com wrote: > Take a look at https://github.com/prometheus/test-infra > > This is what we use to benchmark changes and each release. > > On Thu, Sep 23, 2021 at 9:28 AM sayf.eddi...@gmail.com < > sayf.eddi...@

Re: [prometheus-users] PromQL to identify targets which are failing continuously in last 5 minutes

2021-09-23 Thread Ben Kochie
avg_over_time(probe_success[5m]) == 0 On Fri, Sep 24, 2021 at 4:42 AM Narendra Gudipudi wrote: > Hi, > > I am using blackbox exporter to probe my targets. I am trying to write a > PromQL to identify and alert every 5 minutes with details of targets which > are continuously failing in the last 5
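As a rough sketch of how the expression above could be wired into an alert: `probe_success` is 1 for a successful probe and 0 for a failed one, so its 5-minute average is 0 only when every sample in the window failed. The group name, alert name, severity label, and annotation text below are assumptions for illustration and are not taken from the thread.

```yaml
groups:
  - name: blackbox-probes                # hypothetical group name
    rules:
      - alert: ProbeFailingContinuously  # hypothetical alert name
        # Fires for each target whose probes all failed in the last 5 minutes.
        expr: avg_over_time(probe_success[5m]) == 0
        labels:
          severity: warning              # assumed label, adjust to local conventions
        annotations:
          summary: "Probe of {{ $labels.instance }} has failed continuously for 5 minutes"
```

How often a firing alert is re-sent is usually controlled on the Alertmanager side rather than in the rule itself.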

[prometheus-users] PromQL to identify targets which are failing continuously in last 5 minutes

2021-09-23 Thread Narendra Gudipudi
Hi, I am using blackbox exporter to probe my targets. I am trying to write a PromQL query to identify, and alert every 5 minutes on, the targets which have been failing continuously for the last 5 minutes. Any help here is greatly appreciated. I run my probers every 1 minute; *9:00 AM* probe_success

[prometheus-users] prometheus-operator with custom flags

2021-09-23 Thread Barrow Kwan
I wonder if we can pass custom flags in prometheus-operator? E.g., I want to pass `--storage.local.dirty` to fix a corrupted WAL. Thanks

Re: [prometheus-users] Re: Large portions of prometheus TSDB data getting corrupted and deleted

2021-09-23 Thread Jiacai Liu
WAL repair logic is here: https://github.com/prometheus/prometheus/blob/v2.22.1/tsdb/wal/wal.go#L339. The func comment says it discards all data after the corruption. According to your log at 09-10, the WAL at that time has a high probability of having been corrupted, so Prometheus will delete al

Re: [prometheus-users] Re: Large portions of prometheus TSDB data getting corrupted and deleted

2021-09-23 Thread Ben Kochie
There have been several WAL corruption fixes since that version. On Thu, Sep 23, 2021 at 5:23 PM Brandon Duffany wrote: > Yep -- > > Prometheus: 2.22.1 -- revision 00f16d1ac3a4c94561e5133b821d8e4d9ef78ec2 > > Filesystem: ext4 > > > On Thursday, September 23, 2021 at 12:18:21 AM UTC-4 Julien Pivo

Re: [prometheus-users] Re: Large portions of prometheus TSDB data getting corrupted and deleted

2021-09-23 Thread Brandon Duffany
Yep -- Prometheus: 2.22.1 -- revision 00f16d1ac3a4c94561e5133b821d8e4d9ef78ec2 Filesystem: ext4 On Thursday, September 23, 2021 at 12:18:21 AM UTC-4 Julien Pivotto wrote: > Can we know the filesystem you use and your Prometheus version? > > Le jeu. 23 sept. 2021 à 06:06, Brandon Duffany a >

[prometheus-users] Re: can JMX exporter provide http server request statistics?

2021-09-23 Thread Ponson
Hello Friends, can someone please help clarify this? On Wednesday, September 22, 2021 at 6:11:28 PM UTC+5:30 Ponson wrote: > Hello Experts, > We are using jmx exporter to export JVM stats for our java application. > We recently had a requirement to collect http server request statistics an

Re: [prometheus-users] re: basic_auth: fails in prometheus.yml

2021-09-23 Thread Stuart Clark
On 23/09/2021 14:00, Spyros Maziotis wrote: Cannot add these statements in prometheus.yml basic_auth: username: xxx password: yyy Prometheus (ver. 2.29.2) fails to start Error: yaml:unmarshal errors: field username not found in type config.ScrapeConfig Same for p

[prometheus-users] re: basic_auth: fails in prometheus.yml

2021-09-23 Thread Spyros Maziotis
Cannot add these statements in prometheus.yml basic_auth: username: xxx password: yyy Prometheus (ver. 2.29.2) fails to start Error: yaml:unmarshal errors: field username not found in type config.ScrapeConfig Same for password Can anyone enlighten me on this? Best
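For reference, an error of the form "field username not found in type config.ScrapeConfig" typically means that `username` was parsed as a field of the scrape config itself instead of as a child of `basic_auth`, which points to an indentation problem. The fragment below is a minimal sketch of the expected nesting, not the poster's actual file; the job name and target are hypothetical.

```yaml
scrape_configs:
  - job_name: example            # hypothetical job name
    basic_auth:
      # username and password must be indented one level below basic_auth
      username: xxx
      password: yyy
    static_configs:
      - targets: ["localhost:9100"]   # hypothetical target
```

Running `promtool check config prometheus.yml` before restarting should report this kind of unmarshal error without taking Prometheus down.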

Re: [prometheus-users] Re: Single Prometheus for Large Cluster

2021-09-23 Thread Brian Candler
Dropping individual labels isn't likely to make a huge difference, if you're still scraping the same set of timeseries. The bag of labels is just what distinguishes one timeseries from another. It does have to be kept in memory, but it's static and doesn't use much RAM. Dropping labels might e

Re: [prometheus-users] Re: Single Prometheus for Large Cluster

2021-09-23 Thread patricia lee
Thanks for the information. In the meantime, we are trying to drop the label with high memory usage in our Prometheus, so we dropped the ID (test environment). However, even after we dropped the labels on all jobs, the memory usage is still at 5Gi (which is the same). Will the drop in memory usage of Pro

[prometheus-users] Cluster mode sending duplicate alerts

2021-09-23 Thread 'Hugo Dias' via Prometheus Users
I have a monitoring system using 2 EC2 instances, each running prom/prometheus:v2.30.0 and prom/alertmanager:v0.23.0, and both of the Prometheus instances are scraping the same metrics and sending the alerts to both alertmanagers: # Alertmanager configuration alerting: alertmanagers: - static

Re: [prometheus-users] Large cluster simulation

2021-09-23 Thread Ben Kochie
Take a look at https://github.com/prometheus/test-infra This is what we use to benchmark changes and each release. On Thu, Sep 23, 2021 at 9:28 AM sayf.eddi...@gmail.com < sayf.eddine.hamm...@gmail.com> wrote: > Hello, > I want to test the behavior of my Prometheus setup (HA, and/or federation)

[prometheus-users] Large cluster simulation

2021-09-23 Thread sayf.eddi...@gmail.com
Hello, I want to test the behavior of my Prometheus setup (HA and/or federation) on a large setup (resource consumption, possible crashes, latencies, etc.). Are there any tools available for that? I am thinking about using a small number of servers but configuring them multiple times in Prometheus with