[prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
The alert is reaching Alertmanager, but I am not receiving the email. The configuration I use is as below: global: smtp_require_tls: false route: group_by: ['instance', 'severity'] group_wait: 30s group_interval: 60s repeat_interval: 60s receiver: team-1 receivers: - name

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Mohd Zakir
Hello Bharathwaj, Can you please try with changing the from email address. Thank you, Regards, Mohd Zakir. On Wed, Nov 25, 2020 at 1:41 PM Bharathwaj Shankar < saialumnibharath...@gmail.com> wrote: > The alert is getting triggered to alert manager but I am not receiving > email. > > > The confi

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread 'Kunal Khandelwal' via Prometheus Users
Hi Team, I am facing an issue while starting the Prometheus service on Ubuntu; it's throwing the following: systemctl status prometheus ● prometheus.service - Prometheus Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: enabled) Active: failed (Result: exit-code)

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
I have tried with that as well. it is not working On Wednesday, November 25, 2020 at 1:48:10 PM UTC+5:30 mohd wrote: > Hello Bharathwaj, > > Can you please try with changing the from email address. > > Thank you, > Regards, > Mohd Zakir. > > On Wed, Nov 25, 2020 at 1:41 PM Bharathwaj Shankar >

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread Christian Hoffmann
Hi, On 11/25/20 9:19 AM, 'Kunal Khandelwal' via Prometheus Users wrote: I am facing an issue while starting Prometheus Service in Ubuntu, it's throwing the following: In the future, could you please start a new thread? The way you posted makes it appear that your issue is somehow related to th

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
Please help me with the working sample of alertmanager.yml file to edit my email and config to check On Wednesday, November 25, 2020 at 1:58:33 PM UTC+5:30 Bharathwaj Shankar wrote: > I have tried with that as well. it is not working > > On Wednesday, November 25, 2020 at 1:48:10 PM UTC+5:30 mo

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread 'Kunal Khandelwal' via Prometheus Users
Hi Christian, No, I didn't miss "s" that's cut, In my service file it has a proper name. Well, here is my service file, can you tell me what I missed? [Unit] Description=Prometheus #Documentation=https://prometheus.io/docs/introduction/overview/ Wants=network-online.target After=network-online.t

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
I have tried it like this, and it still shows the status as firing only. route: repeat_interval: 1d group_interval: 1d group_by: [Alertname] # Send all notifications to me. receiver: email-me receivers: - name: email-me email_configs: - to: sssbgmschoolanalyt...@gmail.com from: saialumnib
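
For readers following this thread, a minimal sketch of an email-only alertmanager.yml is laid out below. It is not the poster's final configuration, since the archive does not record what resolved the issue. The SMTP host, addresses, password, and timing values are placeholders and assumptions; only the key names are standard Alertmanager configuration fields.

```yaml
# Hypothetical values throughout: replace the host, addresses, and password.
global:
  smtp_smarthost: 'smtp.example.org:587'        # assumed SMTP relay, host:port
  smtp_from: 'alertmanager@example.org'         # assumed sender address
  smtp_auth_username: 'alertmanager@example.org'
  smtp_auth_password: 'changeme'                # placeholder, not a real secret
  smtp_require_tls: true                        # many relays require STARTTLS

route:
  receiver: email-me
  group_by: ['alertname']                       # label names are case-sensitive
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: email-me
    email_configs:
      - to: 'oncall@example.org'                # assumed recipient
        send_resolved: true
```

A file like this can be checked with `amtool check-config alertmanager.yml` before restarting Alertmanager. Note that "firing" describes the state of the alert itself, not whether a notification was delivered; delivery failures normally show up in the Alertmanager log.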

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread Christian Hoffmann
Hi, On 11/25/20 9:40 AM, 'Kunal Khandelwal' via Prometheus Users wrote: No, I didn't miss "s" that's cut, In my service file it has a proper name. Ah, ok. Well, here is my service file, can you tell me what I missed? I don't spot any obvious problems at the first glance. However, I do notice

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread 'Kunal Khandelwal' via Prometheus Users
Hi, Please find the following output: root@ARL-KUNAL:/home/kunal/Documents/Prometheus/prometheus-2.22.2.linux-amd64# systemctl cat prometheus.service # Warning: prometheus.service changed on disk, the version systemd has loaded is outdated. # This output shows the current version of the unit's or

Re: [prometheus-users] Debugging OOM issue.

2020-11-25 Thread Yagyansh S. Kumar
Cool, thanks for the quick help. On Wed, Nov 25, 2020 at 1:18 PM Ben Kochie wrote: > No, concurrency only affects how many queries are running at the same > time. > > On Wed, Nov 25, 2020 at 8:45 AM Yagyansh S. Kumar < > yagyanshsku...@gmail.com> wrote: > >> Thanks, Ben. Was thinking of doing th

Re: [prometheus-users] Issue in start promentheus service

2020-11-25 Thread Christian Hoffmann
On 11/25/20 9:58 AM, 'Kunal Khandelwal' via Prometheus Users wrote: root@ARL-KUNAL:/home/kunal/Documents/Prometheus/prometheus-2.22.2.linux-amd64# systemctl cat prometheus.service # Warning: prometheus.service changed on disk, the version systemd has loaded is outdated. # This output shows the c

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
still facing the same issue. any other idea?? On Wednesday, November 25, 2020 at 2:15:31 PM UTC+5:30 Bharathwaj Shankar wrote: > have tried like this and it is showing status as firing only. > > > > route: > repeat_interval: 1d > group_interval: 1d > group_by: [Alertname] > # Send all no

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
do you have any demo mail server to check On Wednesday, November 25, 2020 at 3:07:56 PM UTC+5:30 Bharathwaj Shankar wrote: > still facing the same issue. any other idea?? > > On Wednesday, November 25, 2020 at 2:15:31 PM UTC+5:30 Bharathwaj Shankar > wrote: > >> have tried like this and it is s

[prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread yagyans...@gmail.com
Hi. I am using Alertmanager 0.21.0. Occasionally, the active alerts go to resolved state for a second and then come back to firing state immediately. There is no pattern of this happening, it happens randomly. Haven't been able to identify why this is happening. Any thoughts here? Where should

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Matthias Rampke
This could be many things, likely it has to do with the formulation of the alert. What does it look like in Prometheus? Specifically - the ALERTS metric shows what is pending or firing over time - evaluate the alert expression in Prometheus for the given time period. Are there gaps or does e.g. a

[prometheus-users] Different databases for different targets

2020-11-25 Thread Guna Kambalimath
Hey there, Question1: How does prometheus internally store the metrics ? Is it possible to view the same (not in prometheus UI). *Just like we can view influx Time Series Data Base, is it possible to view the Prometheus DB ? Does prometheus have any structure of storing data like, databases and

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread yagyans...@gmail.com
The alert formation doesn't seem to be a problem here, because it happens for different alerts randomly. Below is the alert for Exporter being down for which it has happened thrice today. - alert: ExporterDown expr: up == 0 for: 10m labels: severity: "CRITICAL" annotation
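
Laid out as a rule file, the alert quoted above would look roughly like the sketch below. The group name and the annotation text are assumptions, because the original message is truncated in the archive; the alert name, expression, `for` duration, and severity label are taken from the post.

```yaml
groups:
  - name: exporter-health              # hypothetical group name
    rules:
      - alert: ExporterDown
        expr: up == 0
        for: 10m
        labels:
          severity: "CRITICAL"
        annotations:
          # Assumed text; the original annotations are cut off in the archive.
          summary: "Exporter down on {{ $labels.instance }}"
```

With `for: 10m`, the expression has to return a result at every evaluation for ten minutes before the alert fires, and a single evaluation that returns nothing resets it. Graphing `ALERTS{alertname="ExporterDown"}` in Prometheus over the affected period is one way to see whether the alert itself flapped.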

Re: [prometheus-users] Different databases for different targets

2020-11-25 Thread Stuart Clark
Prometheus stores data in a time series database, which is designed specifically for the needs of the application. That data is stored on disk in a variety of files and directories, including a Write Ahead Log (WAL) and set of blocks. Running PromQL queries is how you can see what is stored in

Re: [prometheus-users] Alert manager email not getting triggered

2020-11-25 Thread Bharathwaj Shankar
the issue is resolved On Wednesday, November 25, 2020 at 3:07:56 PM UTC+5:30 Bharathwaj Shankar wrote: > still facing the same issue. any other idea?? > > On Wednesday, November 25, 2020 at 2:15:31 PM UTC+5:30 Bharathwaj Shankar > wrote: > >> have tried like this and it is showing status as fir

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Stuart Clark
On 25/11/2020 11:46, yagyans...@gmail.com wrote: The alert formation doesn't seem to be a problem here, because it happens for different alerts randomly. Below is the alert for Exporter being down for which it has happened thrice today.   - alert: ExporterDown     expr: up == 0     for: 10m   

[prometheus-users] Re: Blackbox reporting “Resolution with IP protocol failed” error after running fine for few hours.

2020-11-25 Thread Chris Paulraj
Tried a different build that includes network tools, but I am unable to figure out why the lookup fails. Tried a blackbox-exporter image from Docker Hub, which resulted in the same issue, although it lasted for 8 hours without error. It does look like this is an environmental issue with my setup, woul

[prometheus-users] jvm_memory_bytes_used{service="x", area="heap"} vs java_lang_memory_heapmemoryusage_used{service="x"}

2020-11-25 Thread Tigran
Hello, I am running a k8s cluster with 2 instances, and I noticed a large difference between these two metrics. I would like to understand why they return different values. *metrics from jvm_memory_bytes_used{service="x", area="heap"} (instance 1 and 2)* *java_lang_memory_heapmem

Re: [prometheus-users] jvm_memory_bytes_used{service="x", area="heap"} vs java_lang_memory_heapmemoryusage_used{service="x"}

2020-11-25 Thread Brian Brazil
On Wed, 25 Nov 2020 at 13:35, Tigran wrote: > Hello, > > I am running a k8s cluster with 2 instances, and I noticed that I have a > lot of difference between these two metrics, I would like to understand why > they return a different value. > > *metrics from jvm_memory_bytes_used{service="x", area=

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Yagyansh S. Kumar
Hi Stuart. On Wed, 25 Nov, 2020, 6:56 pm Stuart Clark, wrote: > On 25/11/2020 11:46, yagyans...@gmail.com wrote: > > The alert formation doesn't seem to be a problem here, because it > > happens for different alerts randomly. Below is the alert for Exporter > > being down for which it has happen

[prometheus-users] Register Caffeine Cache Collector mBean

2020-11-25 Thread Bernardo
Hi, Just added the Caffeine CacheMetricsCollector by following the docs: https://github.com/prometheus/client_java/tree/11408239035f02a125fe3c860f05fcd0be1e7873#caches However when I start my application and attach VisualVM to it I can't see the mBeans for the caffeine cache: [image: image.png]

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Stuart Clark
How many Alertmanager instances are there? Can they talk to each other and is Prometheus configured and able to push alerts to them all? On 25 November 2020 14:07:41 GMT, "Yagyansh S. Kumar" wrote: >Hi Stuart. > >On Wed, 25 Nov, 2020, 6:56 pm Stuart Clark, >wrote: > >> On 25/11/2020 11:46, ya

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Yagyansh S. Kumar
On Wed, 25 Nov, 2020, 8:26 pm Stuart Clark, wrote: > How many Alertmanager instances are there? Can they talk to each other and > is Prometheus configured and able to push alerts to them all? > Single instance as of now. I did set up an Alertmanager mesh of 2 Alertmanagers but I am facing duplic

[prometheus-users] Error while pulling docker image

2020-11-25 Thread Arindam Datta
I am getting 500: Internal server error while pulling the docker image quay.io/prometheus/prometheus:v2.22.1.

Re: [prometheus-users] Error while pulling docker image

2020-11-25 Thread Julien Pivotto
On 25 Nov 07:19, Arindam Datta wrote: > I am getting 500: Internal server error while pulling the docker > image quay.io/prometheus/prometheus:v2.22.1. Quay is having issues at the moment according to their website: https://status.quay.io/ As a workaround you could docker pull prom/prometheus:

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Stuart Clark
Is the second instance still running? If you are having some cluster communications issues that could result in what you are seeing. Both instances learn of an alert but then one instance missed some of the renewal messages, so resolves it. Then it gets updated and the alert is fired again. If
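
When more than one Alertmanager is running, Prometheus is normally configured to send alerts to every instance directly. A minimal sketch of the relevant prometheus.yml fragment follows; both hostnames are hypothetical and the port is the Alertmanager default.

```yaml
# prometheus.yml fragment; both targets are assumed example hosts.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'alertmanager-1.example.org:9093'
            - 'alertmanager-2.example.org:9093'
```

Each Alertmanager would also need to be started with a `--cluster.peer` flag pointing at the other instance (the cluster port defaults to 9094) so that the two can exchange notification state.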

Re: [prometheus-users] Prometheus using AWS Timestream

2020-11-25 Thread Ryan Booz
As the makers of Promscale, we're very attuned to the needs of effective Prometheus deployments. With that in mind, one thing to consider with Timestream is that ingest performance from a single client seems to be a current limitation. The creator of this adaptor doesn't mention his setup o

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Yagyansh S. Kumar
On Wed, 25 Nov, 2020, 9:34 pm Stuart Clark, wrote: > Is the second instance still running? > > If you are having some cluster communications issues that could result in > what you are seeing. Both instances learn of an alert but then one instance > missed some of the renewal messages, so resolves

Re: [prometheus-users] Prometheus using AWS Timestream

2020-11-25 Thread Stuart Clark
On 25/11/2020 16:27, Ryan Booz wrote: As the makers of Promscale, we're very attuned to the needs of effective Prometheus deployments. With that in mind, one thing to consider with Timestream is that ingest performance from a single client seems to be a current limitation. The creator of this a

Re: [prometheus-users] Alert goes to Firing --> Resolved --> Firing immediately.

2020-11-25 Thread Stuart Clark
On 25/11/2020 16:27, Yagyansh S. Kumar wrote: On Wed, 25 Nov, 2020, 9:34 pm Stuart Clark, > wrote: Is the second instance still running? If you are having some cluster communications issues that could result in what you are seeing. Both instances

Re: [prometheus-users] Prometheus using AWS Timestream

2020-11-25 Thread Ryan Booz
I can't speak to CrateDB's tests, but the article I linked to said it took them 3 days to load ~3 billion metrics using 20 clients, which is on-par with our findings too. For our tests, we used TSBS and used the "cpu-only" use case to simulate 100 hosts. That test creates 1,000 time-series acro

Re: [prometheus-users] Prometheus using AWS Timestream

2020-11-25 Thread Ryan Booz
Sorry - just realized I mistyped that second sentence in the midst of trying to spell it out and crossed terminology. It should have read: For our tests, we used TSBS and used the "cpu-only" use case to simulate 100 hosts. That test creates 10 CPU time-series for 100 hosts, every 10 seconds -

[prometheus-users] Unknown series references

2020-11-25 Thread alexb...@gmail.com
Prometheus version: v2.15.2 Problem: the size of Prometheus's WAL is 42G, which leads to OOM, and Prometheus can't restart unless the WAL is removed manually. [image: 1.jpg] My questions are: 1. Is the "Unknown series references" error the reason the WAL is too big? 2. Under w

[prometheus-users] prometheus statefulset with thanos stop working after two days

2020-11-25 Thread 墨生
Hey guys, I'm a newbie to Prometheus and recently I've been working on Prometheus HA with Thanos. Everything looked good, but it stopped working after 2 days of running. Here are some of the logs. Logs for prometheus-3: level=info ts=2020-11-26T03:45:48.626Z caller=manager.go:934 component="rule manager" msg="Rule manag

[prometheus-users] Newline in Alert description

2020-11-25 Thread Guna Kambalimath
Hello, I have my alert configuration written like the following: - name: pod-status groups: - name: pod not in expected state rules: - alert: microservice_status_alert expr: kube_pod_container_status_running == 0 for: 2m labels:
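
One way to carry line breaks in an annotation is a YAML literal block scalar, sketched below in the standard rule-file layout. The alert name, expression, and `for` value come from the post; the description text is an assumption, as is the availability of the `pod`, `container`, and `namespace` labels on the metric. Whether the line breaks survive in the final notification depends on the receiver and its template, which this sketch does not cover.

```yaml
groups:
  - name: pod-status
    rules:
      - alert: microservice_status_alert
        expr: kube_pod_container_status_running == 0
        for: 2m
        annotations:
          # "|" is a YAML literal block scalar: line breaks are kept as written.
          description: |
            Pod: {{ $labels.pod }}
            Container: {{ $labels.container }}
            Namespace: {{ $labels.namespace }}
```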

[prometheus-users] Queries for memory/cpu optimization

2020-11-25 Thread Dudi Cohen
Hi, I'm looking for queries which will calculate the optimal cpu/memory settings for an application according to past data of allocation vs actual usage, similar to what the VPA does: https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/recommender/README.md Thanks

Re: [prometheus-users] Prometheus using AWS Timestream

2020-11-25 Thread 'ellis...@googlemail.com' via Prometheus Users
I agree. I'm not too concerned about the alerting side of things as this is covered by Grafana, the recording rules help filter out the noise where possible. I've had a bash at altering the dashboard in Prometheus but it isn't that user friendly to configure, hence the swap over to Grafana.

Re: [prometheus-users] Prometheus using AWS Timestream

2020-11-25 Thread 'ellis...@googlemail.com' via Prometheus Users
and thanks Ryan. I am just in the process of building the adapter into a container at the moment and haven't tested throughput. Good to know. On Thursday, 26 November 2020 at 07:28:40 UTC ellis...@googlemail.com wrote: > I agree. I'm not too concerned about the alerting side of things as this >