Re: [prometheus-users] Disable remote write retry

2020-09-12 Thread Bartłomiej Płotka
Hey, unless there is a bug on the receiving side (maybe your front proxy is masking the actual status code) or in Cortex, both Cortex and Thanos Receive, when they reject a write for reasons like this (something there is no point retrying), return a status code that tells Prometheus

[prometheus-users] prometheus delete old data files

2020-09-12 Thread Johny
I am reducing data retention from 20 days to 10 days in my Prometheus nodes (v. 2.17). When I change *storage.tsdb.retention.time* to 10d and restart my instances, data older than 10 days does not get deleted. Is there a command to force cleanup? In general, what is the best practice to delete

Re: [prometheus-users] Re: Memory usage discrepancy

2020-09-12 Thread Bartłomiej Płotka
Hey, I wrote about this here https://www.bwplotka.dev/2019/golang-memory-monitoring/ at some point as well. Unless something changes in the exporter that actually gives you this metric (it's not a Prometheus question, Prometheus is merely collecting those from other exporters) container_memory_work

[prometheus-users] Question on promtool and amtool

2020-09-12 Thread kiran
Hello all 1. Is promtool automatically installed with Prometheus? 2. If Prometheus is installed in a docker container, how to use promtool to validate Prometheus.yml file 3. Is amtool automatically installed with alertmanager? 4. If alertmanager is installed in a docker container, how to use amtoo

[prometheus-users] Disable remote write retry

2020-09-12 Thread Ruben Papovyan
Hi team, What are the options to disable remote write retry ? Can I use following config to disable remote write retry ? ``` remote_write: url: http://cortex.local.int queue_config: min_backoff: 2h max_backoff: 2h ``` or if I need to retry 4 times can I use config ? ``` remote_write:
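For readers following this thread: as far as I understand, `min_backoff` and `max_backoff` only bound the delay between retries of a failed send; they do not set a retry count. The sketch below is a minimal illustration, assuming the delay starts at `min_backoff` and doubles up to `max_backoff`. The helper `backoff_schedule` and the millisecond values are hypothetical and are not taken from Prometheus code.

```python
from datetime import timedelta


def backoff_schedule(min_backoff, max_backoff, attempts):
    """Return the wait before each of the first `attempts` retries.

    Hypothetical helper: assumes the wait starts at min_backoff and
    doubles after every retry, capped at max_backoff.
    """
    waits = []
    wait = min_backoff
    for _ in range(attempts):
        waits.append(wait)
        wait = min(wait * 2, max_backoff)
    return waits


# The configuration from the question: both bounds set to 2h.
same_bounds = backoff_schedule(timedelta(hours=2), timedelta(hours=2), 4)

# Illustrative narrower bounds, for comparison only.
narrow_bounds = backoff_schedule(
    timedelta(milliseconds=30), timedelta(milliseconds=100), 4
)

for label, waits in (("2h/2h", same_bounds), ("30ms/100ms", narrow_bounds)):
    print(label, [str(w) for w in waits])
```

Under that assumption, setting both bounds to 2h spaces the retries two hours apart rather than switching them off.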

Re: [prometheus-users] Re: Memory usage discrepancy

2020-09-12 Thread mspr...@us.ibm.com
Welcome to the club. https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/golang-nuts/LsOYrYc_Occ/LbjLAsL6BwAJ On Friday, September 11, 2020 at 5:26:19 AM UTC-4 Anoop wrote: > HI, > > Anyone have any suggestions on this? > > > > Thanks & Regards, > > Anoop Mohan > > Mob# +91-7

Re: [prometheus-users] Active alertnames from Prometheus alertmanager

2020-09-12 Thread Harald Koch
On Fri, Sep 11, 2020, at 08:56, neel patel wrote: > > But is there any way we can display all the active alerts with names in the > grafana dashboard in tabular format, how to fetch those data from the alert > manager ? I use this: https://github.com/camptocamp/grafana-prometheus-alertmanager-

[prometheus-users] Implementing SLO alerting with Prometheus only

2020-09-12 Thread Martin Chodúr
Hi, we have been using SLO computation and alerting based on Prometheus (and Thanos) for over a year. We learned quite a lot of things that were not so obvious from the SRE Workbook, which we followed, and we decided to share some of those findings in a series of blog posts. A few days ago we publish

Re: [prometheus-users] How to keep data of removed hosts?

2020-09-12 Thread harkis...@timescale.com
Prometheus does not delete metrics by itself. I think moving the query window back a day should give you the results, provided you supply the correct metric name and labels. On Friday, September 11, 2020 at 4:57:56 PM UTC+5:30 Stuart Clark wrote: > On 2020-09-11 11:03, Martin Emrich wro

Re: [prometheus-users] Re: Memory usage discrepancy

2020-09-12 Thread Harkishen@Timescale
How about using *node:node_memory_utilisation:ratio * 100*? On Friday, September 11, 2020 at 2:56:19 PM UTC+5:30 Anoop wrote: > HI, > > Anyone have any suggestions on this? > > > > Thanks & Regards, > > Anoop Mohan > > Mob# +91-7293009486 <+91%2072930%2009486> > E-mail : anoopmo...@gmail.com > >

[prometheus-users] Re: Grafana dashboard for Prometheus itself

2020-09-12 Thread Brian Candler
There is a newer dashboard linked in this recent posting to the group: https://groups.google.com/d/msg/prometheus-users/yxKi1jfQ7GE/Rp5HwhXbAwAJ

[prometheus-users] Re: Alertmanager status component=silences msg="Running maintenance failed" permission denied

2020-09-12 Thread Brian Candler
I'm stating the obvious here, but it looks like a permissions problem on that directory or file. At minimum you should: 1. identify what uid alertmanager is running as, and what working directory it's running in 2. check permissions on the "data" subdirectory, and all files in that directory

Re: [prometheus-users] Prometheus Latest Version Upgrade - Help/Guidance Needed

2020-09-12 Thread Dinesh N
Can someone please guide me through this issue? On Thu, 10 Sep, 2020, 8:25 pm Dinesh N, wrote: > Ben - This issue happens under heavy load - there is a ticket already in > GitHub and acknowledged by some of them having similar issue - > https://github.com/prometheus/prometheus/issues/6139 >

[prometheus-users] Re: Metric type for basic web analytics

2020-09-12 Thread Nick
Thanks, that makes sense. On Friday, 11 September 2020 at 21:29:34 UTC+10 b.ca...@pobox.com wrote: > The metric type you want is "counter", which is incremented on each hit. > > However you need to be careful here, as you may end up with a cardinality > explosion if you label your metrics with {
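The quoted advice can be sketched roughly as follows. This is a plain-Python illustration, not a Prometheus client: one count is kept per label value and incremented on each hit. The quoted text is cut off before it names the problematic labels, so the `path` label, the `KNOWN_PATHS` set, and the `record_hit` helper here are purely hypothetical.

```python
from collections import Counter

# Hypothetical, fixed set of label values we are willing to track.
KNOWN_PATHS = ("/", "/about", "/contact")

# One monotonically increasing count per label value.
hits = Counter()


def record_hit(path):
    """Increment the hit counter, folding unknown paths into 'other'.

    Bounding the label values keeps the number of distinct series
    fixed no matter how many different paths are requested.
    """
    label = path if path in KNOWN_PATHS else "other"
    hits[label] += 1


for requested in ["/", "/about", "/", "/user/123", "/user/456"]:
    record_hit(requested)

print(dict(hits))
```

Without the fold into `other`, every distinct requested path would create its own series, which is the kind of growth the cardinality warning refers to.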