[prometheus-users] Is it possible to find out how many samples are used to serve a single Prometheus Query?

2022-02-08 Thread 'ping...@hioscar.com' via Prometheus Users
Or queries on average? Thanks.

[prometheus-users] Re: Query log isn't very helpful in finding queries that crash server

2022-01-18 Thread 'ping...@hioscar.com' via Prometheus Users
To address this, logging query start and end events separately would help. On Tuesday, January 18, 2022 at 10:41:25 PM UTC-5 ping...@hioscar.com wrote: > It seems to me that because the query log records the end time, it doesn't log > queries until they are finished. So, if the server runs into OOM and crashes …

[prometheus-users] Query log isn't very helpful in finding queries that crash server

2022-01-18 Thread 'ping...@hioscar.com' via Prometheus Users
It seems to me that because the query log records the end time, it doesn't log queries until they are finished. So, if the server runs into OOM and crashes due to some expensive queries, the offending queries will never be logged in the query log. { "params": { "end": "2022-01-19T03:37:34.316Z", "que…
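For background, the query log itself is enabled through the global section of prometheus.yml and picked up on a config reload (SIGHUP or /-/reload); a minimal sketch, where the file path is only an example:

  # prometheus.yml
  global:
    query_log_file: /prometheus/data/query.log   # path must be writable by Prometheus

Each finished query is then written as one JSON line, which is where the "params"/"stats" fragments quoted in these threads come from.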

[prometheus-users] Prometheus crash due to OOM

2022-01-05 Thread 'ping...@hioscar.com' via Prometheus Users
Hi, We are running Prometheus 2.25.0. We have been running into issues with expensive queries causing the prometheus service to crash. We are giving it 64GB of RAM. We have aggressively limited the query timeout to 1m and query.max-samples to 10,000,000 (20% of the default value), which based on my reading …
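For reference, those limits are command-line flags on the prometheus binary rather than config-file settings; a sketch using the values mentioned above (adjust to your own deployment):

  prometheus \
    --query.timeout=1m \
    --query.max-samples=10000000 \
    --query.max-concurrency=20   # default; lowering it also bounds peak query memory

Note that --query.max-samples caps the samples a single query may hold in memory at once, so several concurrent expensive queries can still add up to more than that.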

Re: [prometheus-users] How-To debug prometheus_rule_evaluation_failures_total? Prometheus is failing rule evaluations

2021-05-13 Thread 'ping...@hioscar.com' via Prometheus Users
We are facing an issue where rules fail sporadically from time to time. Are these errors logged somewhere if they cannot be found in the UI? Thanks On Saturday, May 1, 2021 at 11:01:49 AM UTC-4 matt...@prometheus.io wrote: > That looks good, I think the issue is which target(s) you discover for > these
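As a small meta-monitoring sketch (my own illustration, not from the thread), failures like those in the subject line can be caught with an alert on that counter; the underlying errors should also show up in the Prometheus server's own log:

  # Any increase means at least one rule or rule group failed to evaluate
  # during the window:
  increase(prometheus_rule_evaluation_failures_total[10m]) > 0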

[prometheus-users] query.timeout doesn't work properly?

2021-04-13 Thread 'ping...@hioscar.com' via Prometheus Users
We have --query.timeout set to 2m. However, in the query log we see queries with execTotalTime well over 2m, like 5m. "stats": { "timings": { "evalTotalTime": 343.356913903, "resultSortTime": 0, "queryPreparationTime": 0.243052229, "innerEvalTime": 0, "e…

[prometheus-users] "/alerts" endpoint takes forever to load on 2.25.0

2021-03-16 Thread 'ping...@hioscar.com' via Prometheus Users
After upgrading to 2.25.0 (from 2.11.1), the "/alerts" endpoint takes forever to load. We do have thousands of alerts. But it loaded fine before the upgrade. Any advice? Thanks

Re: [prometheus-users] PromQL formatter?

2021-03-12 Thread 'ping...@hioscar.com' via Prometheus Users
Has anyone found a good formatter yet? Thanks On Tuesday, September 15, 2020 at 1:24:07 AM UTC-4 ping...@hioscar.com wrote: > Thanks. Before that's shipped, I'm hoping I can find a formatter that can > do similar things, i.e. as simple as "adding line breaks and proper > indentations after each (

[prometheus-users] Alert on no alerts?

2021-01-30 Thread 'ping...@hioscar.com' via Prometheus Users
Is there a way for Alertmanager to fire an alert when it has received no alerts in the last, say, 10 minutes? We recently experienced an issue where prometheus got misconfigured and stopped sending alerts, including alerts about itself. So, to detect this scenario, we cannot rely on prometheus alert rules. I ho…
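The usual answer here is the "dead man's switch" pattern rather than anything built into Alertmanager itself: keep an always-firing heartbeat alert, and have something outside the Prometheus/Alertmanager pair page when that heartbeat stops arriving. A sketch of the rule (names are illustrative):

  # rules/watchdog.yml -- always firing; if an external heartbeat service
  # (or a second, independent monitor) stops receiving it, the alerting
  # pipeline itself is broken.
  groups:
    - name: meta
      rules:
        - alert: Watchdog
          expr: vector(1)
          labels:
            severity: none
          annotations:
            summary: Heartbeat for the alerting pipeline; should always be firing.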

Re: [prometheus-users] PromQL formatter?

2020-09-14 Thread 'ping...@hioscar.com' via Prometheus Users
Thanks. Before that's shipped, I'm hoping I can find a formatter that can do similar things, i.e. as simple as "adding line breaks and proper indentations after each ( or {." On Monday, September 14, 2020 at 4:59:12 PM UTC-4 sup...@gmail.com wrote: > There's a work-in-progress design doc for th

[prometheus-users] Using time range formats as integer values in PromQL?

2020-09-14 Thread 'ping...@hioscar.com' via Prometheus Users
Is it by any chance possible to use time range formats like 1d, 30m, 15s in PromQL and have them automatically converted to integer values in seconds? Meaning, they could be used in conditions like: time() - metric_timestamp{name="foo"} > 1d. Thanks!
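As far as I know (for the Prometheus versions discussed in these threads), duration literals like 1d are only accepted in range selectors, offsets and subqueries, not as scalar values, so the comparison has to spell out the seconds; a sketch using the metric from the question:

  # 1d, 30m, ... work in selectors such as rate(x[5m]) or `offset 1d`,
  # but a scalar comparison needs plain seconds (1d == 86400s):
  time() - metric_timestamp{name="foo"} > 86400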

[prometheus-users] PromQL formatter?

2020-09-14 Thread 'ping...@hioscar.com' via Prometheus Users
What's a good formatter you guys recommend for formatting complex queries? Ideally with line breaks and proper indentations after each ( or {. Thanks!

Re: [prometheus-users] Re: (Alertmanager) Ignore instance label to prevent same alert multiple times

2020-06-04 Thread 'ping...@hioscar.com' via Prometheus Users
Thanks for your replies, guys. We have two replicated prometheus instances scraping the same metrics and sending the same alerts in parallel to alertmanager. We add a label to alerts indicating which prometheus instance the alert is fired from, so that if one prometheus instance is going bad we
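For this kind of HA pair, a common arrangement (a general sketch, not necessarily what this thread settled on) is to keep a per-replica external label for queries but strip it from outgoing alerts with alert_relabel_configs, so both replicas send identical alerts and Alertmanager deduplicates them; the label name prometheus_replica below is an assumption:

  # prometheus.yml on each replica
  global:
    external_labels:
      prometheus_replica: replica-a        # replica-b on the other instance
  alerting:
    alert_relabel_configs:
      - regex: prometheus_replica
        action: labeldrop
    alertmanagers:
      - static_configs:
          - targets: ['alertmanager:9093']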

[prometheus-users] Re: (Alertmanager) Ignore instance label to prevent same alert multiple times

2020-06-04 Thread 'ping...@hioscar.com' via Prometheus Users
+1 We get the same alert multiple times in the same email, because the monitor label (prometheus instance) is different across our simple replicated setup. It would be nice to be able to ignore certain labels so that alert bodies are higher signal. On Thursday, May 9, 2019 at 11:29:58 AM UTC-4, t…