[prometheus-users] divide metrics by instance

2020-06-14 Thread Yashar Nesabian
Hi, I want to calculate Postgres locks using the Postgres exporter. I found this template here: - alert: PostgresqlTooManyLocksAcquired expr: ((sum (pg_locks_count)) / (pg_settings_max_locks_per_transaction * pg_settings_max_connections)) > 0.20

Re: [prometheus-users] What does prometheus do when querying with min step in 10s but the scrape interval is 15s?

2020-06-14 Thread Ray Wu
To be clear, I'm talking about the "Min step" described here: https://grafana.com/docs/grafana/latest/features/datasources/prometheus/#prometheus-query-editor which maps to "step: Query resolution step width in duration format or float number of seconds." in the range query API. The 3x, 4x sounds like apply

Re: [prometheus-users] What does prometheus do when querying with min step in 10s but the scrape interval is 15s?

2020-06-14 Thread Ben Kochie
Generally, if you're using $__interval in Grafana, you want to have a min step that's 3x or 4x your scrape interval. This allows the rate() function to handle missed scrapes and counter resets better. It definitely doesn't make sense to have the min step finer than your interval. You need 2x the i

Re: [prometheus-users] Re: Divide node_exporter loadavg by CPU count doesn't show anything

2020-06-14 Thread Ray Wu
I see, thanks! On Sunday, June 14, 2020 at 12:54:33 AM UTC-7, Ben Kochie wrote: > You need to remove both cpu and mode for the labels to match: > node_load1 / count without (cpu,mode) (node_cpu_seconds_total{mode="idle"}) > On Sun, Jun 14, 2020 at 1:16 AM Ray Wu wrote: >> While readin

[prometheus-users] What does prometheus do when querying with min step in 10s but the scrape interval is 15s?

2020-06-14 Thread Ray Wu
I use Grafana to plot some system metrics from node_exporter. I set my Prometheus scrape interval to 15s, and Grafana's default min step is 10s. Does it make sense at all to plot at a finer granularity than the sampling rate? Is there any doc that explains how Prometheus calculates this?
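The question above can be illustrated with a minimal Python sketch. It assumes a simplified model of a range query over a plain series selector: at each step, the evaluation returns the newest sample at or before that timestamp, provided it falls inside the lookback window (5 minutes by default in Prometheus). The sample data is hypothetical, staleness markers are ignored, and rate() over a range vector is not modeled here.

```python
import bisect

SCRAPE_INTERVAL = 15   # seconds, as in the question
STEP = 10              # seconds, the min step from the question
LOOKBACK = 300         # seconds; Prometheus's default lookback window is 5 minutes

# Hypothetical samples: (timestamp, value), one per scrape.
samples = [(t, float(t)) for t in range(0, 61, SCRAPE_INTERVAL)]
timestamps = [t for t, _ in samples]

def value_at(eval_time):
    """Return the newest sample at or before eval_time, if inside the lookback window."""
    i = bisect.bisect_right(timestamps, eval_time) - 1
    if i < 0 or eval_time - timestamps[i] > LOOKBACK:
        return None
    return samples[i]

# Evaluate once per step, as a range query does.
points = [(t, value_at(t)) for t in range(0, 61, STEP)]
for t, s in points:
    print(t, s)

# Timestamps of the samples actually returned at each step.
sample_times_used = [s[0] for _, s in points]
```

In this model, a 10s step over 15s samples adds no information: some consecutive steps simply return the same underlying sample again.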

[prometheus-users] Re: where to find alertmanager logs

2020-06-14 Thread Mohamed Amine Ait Mbarek
Hi Sir, I'm having the same problem. I'm running Alertmanager on Docker. I have alerts firing in Alertmanager and I can send notifications via Slack, but I can't send emails. Can you please tell me if you found a solution? Thank you

Re: [prometheus-users] Preventing data loss from poor network communication

2020-06-14 Thread Stuart Clark
What you'd generally do is look at using federation or one of the global storage systems like Victoria Metrics, Thanos or Cortex. You'd have a Prometheus server in each location, and then central systems for global views and alerts. On 14 June 2020 12:19:43 BST, "Mathieu Tétreault" wrote: >I

Re: [prometheus-users] Preventing data loss from poor network communication

2020-06-14 Thread Mathieu Tétreault
I will have to double-check. At first glance, the metrics servers didn't have enough resources available to run Prometheus alongside their application. That's the main reason why I started to investigate a watchdog setup and the Pushgateway. My understanding is that it will also prevent

Re: [prometheus-users] PO has decided to replace our Prometheus stack with AppDynamics

2020-06-14 Thread Simon Lyall
On Sun, 14 Jun 2020, 'Jason' via Prometheus Users wrote: On 14.06.20 06:26, Andy Kruta wrote: They're two completely different toolsets that provide different functionalities.  AppDynamics is much like New Relic.  It measures the performance from the end-user perspective, not from the system p

Re: [prometheus-users] PO has decided to replace our Prometheus stack with AppDynamics

2020-06-14 Thread 'Jason' via Prometheus Users
On 14.06.20 06:26, Andy Kruta wrote: They're two completely different toolsets that provide different functionalities.  AppDynamics is much like New Relic.  It measures the performance from the end-user perspective, not from the system perspective.  Both are really needed. Are you sure on tha

Re: [prometheus-users] PO has decided to replace our Prometheus stack with AppDynamics

2020-06-14 Thread Ben Kochie
I don't have much comment on AppDynamics, as I've never used it. But maybe what your PO is trying to get is something you're missing from the current Prometheus deployment. As you say, you're using probes for most of your monitoring. What you're missing is instrumenting your java application dire

[prometheus-users] 0 response code on WATCH against the kubernetes api server

2020-06-14 Thread Erez Rabih
We recently upgraded our Kubernetes control plane from version 1.13 to 1.16. After doing that, we started seeing 0 (zero) response codes on the Prometheus reporting dashboard for WATCH operations against the apiserver. The metric we're using is sum(rate(apiserver_request_count{code!~"^2.*$"}[1m

Re: [prometheus-users] Re: Divide node_exporter loadavg by CPU count doesn't show anything

2020-06-14 Thread Ben Kochie
You need to remove both cpu and mode for the labels to match: node_load1 / count without (cpu,mode) (node_cpu_seconds_total{mode="idle"}) On Sun, Jun 14, 2020 at 1:16 AM Ray Wu wrote: > While reading another document, I found that both time series need to have matching labels. > I have an extra
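Why both labels have to go can be sketched in Python. This is a toy model of one-to-one vector matching, not Prometheus code: the instance names, job label, and values are made up, and the helper functions are hypothetical. A division only produces a result where the label sets on both sides are identical.

```python
from collections import Counter

def without(labels, drop):
    """Return the label set minus the dropped label names, as a hashable key."""
    return frozenset((k, v) for k, v in labels.items() if k not in drop)

# node_load1 carries no cpu or mode label (illustrative values).
node_load1 = [
    ({"instance": "host-a:9100", "job": "node"}, 3.0),
    ({"instance": "host-b:9100", "job": "node"}, 1.0),
]

# node_cpu_seconds_total{mode="idle"}: one series per CPU per instance.
node_cpu_idle = [
    ({"instance": "host-a:9100", "job": "node", "cpu": str(c), "mode": "idle"}, 0.0)
    for c in range(4)
] + [
    ({"instance": "host-b:9100", "job": "node", "cpu": str(c), "mode": "idle"}, 0.0)
    for c in range(2)
]

def count_without(series, drop):
    """Mimic `count without (...)`: group by the remaining labels and count series."""
    return Counter(without(labels, drop) for labels, _ in series)

def divide(left, right_counts):
    """Mimic one-to-one matching: divide only where the label sets are identical."""
    out = {}
    for labels, value in left:
        key = frozenset(labels.items())
        if key in right_counts:
            out[labels["instance"]] = value / right_counts[key]
    return out

# Dropping cpu and mode leaves {instance, job} on both sides, so they match.
load_per_cpu = divide(node_load1, count_without(node_cpu_idle, {"cpu", "mode"}))
# Dropping only cpu leaves mode="idle" on the right side, so nothing matches.
only_cpu_dropped = divide(node_load1, count_without(node_cpu_idle, {"cpu"}))

print(load_per_cpu)
print(only_cpu_dropped)
```

With only cpu removed, the right-hand side still carries mode="idle", which node_load1 lacks, so the result is empty. That matches the "doesn't show anything" symptom in the thread subject.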

Re: [prometheus-users] How to store my own key-value pairs in Prometheus?

2020-06-14 Thread Stuart Clark
On 14/06/2020 01:05, Aviral Srivastava wrote: Prometheus stores time-series data by default. In this default model, the x-axis is time and the y-axis is the value. I want the x-axis to be a number (1,2,3,) and the y-axis to be some value (1000, 2000, 3000, ). How do I store that in Promethe