[prometheus-users] Aggregation Metrics - Found duplicate series for the match group (How delete a label before join metrics ?)

2020-03-10 Thread BDT
Hi everyone, today I have a problem with my rule expression: I am trying to join metrics together to get the name of the swarm node. To do this, I left-joined my metrics on node_id to get node_name. It works fine in the Prometheus console, but when I deploy my rules, I g

[prometheus-users] Aggregation metrics - found duplicate series for the match group (Delete a label before join a metrics ?)

2020-03-10 Thread BDT
Hi everyone, today I have a problem with my rule expression: I am trying to join metrics together to get the name of the swarm node. To do this, I left-joined my metrics on node_id to get node_name. It works fine in the Prometheus console, but when I deploy my rules, I g

[prometheus-users] Re: Sum distinct values

2020-03-10 Thread mert tan
Brian, even the vmware_vm_guest_disk_free values are the same and, unfortunately, the file structure is not consistent, so I can't rely on them. Here is the capacity - free calculation; you can see from it that there is a pointing issue. The last point I reached: ( count_values ("value", vmware_vm_guest_di

[prometheus-users] Re: Sum distinct values

2020-03-10 Thread Brian Candler
You're assuming that two disks that have the same capacity are the same disk. You can't be sure that's true. The way you tell two metrics apart is that they have different labels, not different values. What I *think* is going on is you have a load of alias mounts, where an existing filesyste

[prometheus-users] Re: metrics from cloudwatch_exporter are not digested

2020-03-10 Thread Moses Moore
Okay I figured out what's going on, but I'm still scratching my head about how to federate the metrics gleaned by cloudwatch_exporter. The timestamps are there for a reason -- I could ask for metrics from Cloudwatch exporter and specify "300s ago please" and cloudwatch will return "well, I've g

[prometheus-users] Re: query AlertManager

2020-03-10 Thread NosIreland
My alerts are grouped by datacenters across the globe, and I want them to be presented on a worldmap dashboard in Grafana. Most likely I could group using PromQL and then represent that on the dashboard, but that is a lot of extra work that is already done in Alertmanager. I have looked into alertmanager

Re: [prometheus-users] query AlertManager

2020-03-10 Thread NosIreland
Thanks for the suggestion, but I already looked into this and it does not work as I would like it to. On Tuesday, 10 March 2020 09:31:22 UTC, Simon Pasquier wrote: > > Grouping isn't exposed as labels in metrics because it would be too > much cardinality. > Maybe have a look at this Grafana datasour

Re: [prometheus-users] Alertmanager is sending the resolved notification everytime the current value of metric is changing

2020-03-10 Thread Brian Brazil
On Tue, 10 Mar 2020 at 17:41, Rahul Hada wrote: > Thanks for the quick response Brian, I was checking same. We are not using > Value as label, only using in annotations. Below is the alert config. > Please have a look and suggest what changes can help us achieve better > alert notifications. One

[prometheus-users] Re: Sum distinct values

2020-03-10 Thread mert tan
Thank you for your response. I'm calculating the VMware disk usage, but Prometheus does not have this feature at the moment, so I'm doing vmware_vm_guest_disk_capacity - vmware_vm_guest_disk_free. Here is the problem: regarding the virtualization files, the files are pointing to each other and it brings du

[prometheus-users] Re: Sum distinct values

2020-03-10 Thread Brian Candler
Can you share the raw metrics, i.e. vmware_vm_guest_disk_capacity{vm_name="xxx.com"} and then explain what you're trying to extract from them? -- You received this message because you are subscribed to the Google Groups "Prometheus Users" group. To unsubscribe from this group and stop receivi

Re: [prometheus-users] Alertmanager is sending the resolved notification everytime the current value of metric is changing

2020-03-10 Thread Rahul Hada
Thanks for the quick response Brian, I was checking the same. We are not using the value as a label, only in annotations. Below is the alert config. Please have a look and suggest what changes can help us achieve better alert notifications. One more thing I would ask for help with is, like below alert express

Re: [prometheus-users] Alertmanager is sending the resolved notification everytime the current value of metric is changing

2020-03-10 Thread Brian Brazil
On Tue, 10 Mar 2020 at 17:10, Rahul Hada wrote: > I have alerts configured for various metrics, and has set send_resolved : > true, now once alert is in active state, we are getting [Firing] emails as > configured, but we are getting the [resolved] email even if there is a > slight change in the

[prometheus-users] Sum distinct values

2020-03-10 Thread mert tan
Hi there, I'm having trouble excluding the duplicate values. I need the sum of the unique values. At the moment, I can find the repeated values with count_values. Here is the code: count_values("value", vmware_vm_guest_disk_capacity{vm_name="xxx.com"} ) I'm pretty new to PromQL. I couldn't w

[prometheus-users] Alertmanager is sending the resolved notification everytime the current value of metric is changing

2020-03-10 Thread Rahul Hada
I have alerts configured for various metrics and have set send_resolved: true. Once an alert is in the active state, we get [Firing] emails as configured, but we also get the [resolved] email whenever there is even a slight change in the values. For example, we get a Firing alert when disk usage i

[prometheus-users] blackbox-exporter probe failed with connection reset by peer

2020-03-10 Thread sally zhang
I have blackbox-exporter deployed and am trying to monitor Kubernetes services, but currently the targets are showing as down in Prometheus with the error "connection reset by peer". I port-forwarded the blackbox-exporter service locally and tested the probe with the defined module; all attempts failed. Could anyone help?

[prometheus-users] Blackbox-exporter probe failed with connection reset by peer

2020-03-10 Thread sally zhang
Host OS: Linux monitoring-blackbox-exporter-66b8d48bf9-wxzmf 4.19.86-coreos #1 SMP Mon Dec 2 20:13:38 -00 2019 x86_64 GNU/Linux Blackbox_exporter version: blackbox_exporter, version 0.16.0 (branch: HEAD, revision: 991f89846ae10db22a393

[prometheus-users] Prometheus alerting rules test for counters requiring multiple day span

2020-03-10 Thread Debashish Ghosh
Hi, I have a metric regarding SLA that needs to be 99.95% or above. I am using the formula 100-(((30*24*60*60) - increase(process_uptime_seconds{job="Interop-InboundApi"}[30d]))/(30*24*60*60))*100 that runs for 15 minutes, which means if there is any time missing between the total number of second
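
The arithmetic in the formula above can be checked with a short sketch. It assumes the 30-day window and the 99.95% target from the post; the function names are hypothetical and are not part of Prometheus. With those numbers, the downtime budget is about 1296 seconds (21.6 minutes) per 30 days.

```python
# Sketch of the SLA arithmetic from the post, not a Prometheus API.
# WINDOW_SECONDS mirrors the 30*24*60*60 term in the PromQL expression.
WINDOW_SECONDS = 30 * 24 * 60 * 60


def availability_percent(uptime_increase_seconds, window_seconds=WINDOW_SECONDS):
    """Same shape as the formula: 100 - ((window - increase) / window) * 100."""
    missing_seconds = window_seconds - uptime_increase_seconds
    return 100 - (missing_seconds / window_seconds) * 100


def allowed_downtime_seconds(target_percent, window_seconds=WINDOW_SECONDS):
    """Seconds of missing uptime that still meet the target over the window."""
    return window_seconds * (100 - target_percent) / 100


if __name__ == "__main__":
    budget = allowed_downtime_seconds(99.95)
    print("downtime budget (s):", round(budget, 3))
    print("availability at budget (%):",
          round(availability_percent(WINDOW_SECONDS - budget), 4))
```

Working backwards from the target like this gives a concrete number of seconds to put into a rule test, instead of waiting for a multi-day span to elapse.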

[prometheus-users] Getting all the checks for a server in the same dashboard.

2020-03-10 Thread Yagyansh S. Kumar
Hi. I am monitoring 2500+ servers. I am using node_exporter to collect system metrics and blackbox for the service health checks that are configured on those servers. Not all the servers have the same health check and port; some don't have any health check or port configured at all. I want

Re: [prometheus-users] Why Prometheus recording rule's result can differ from manual query?

2020-03-10 Thread Bjoern Rabenstein
On 04.03.20 03:07, Alexandre Figura wrote: > > How would you explain such difference between results? What's your retention time? If it is 15d (the default), then the recording rule would actually see all the 2w of data you are querying over in the `rate` range (because it is executed regularly f

Re: [prometheus-users] [golang][counters] Is there a noticeable performance penalty for using a CounterVec with no labels as opposed to a Counter

2020-03-10 Thread Bjoern Rabenstein
On 03.03.20 17:13, 'Noam Hurwitz' via Prometheus Users wrote: > It's easier to locally model all counters as CounterVecs and not pass any > labels if we don't want dimensional breakdowns given Go's typing paradigms. I > want to make sure doing this is not at excess cost. AFAICS, there is currently

[prometheus-users] Getting I/O of NFS mount by mountpoint.

2020-03-10 Thread Yagyansh S. Kumar
Hi. I have a system where multiple NFS shares are mounted across servers. I want the total input and output operations of an NFS mount based on its mountpoint (e.g. I have an NFS share mounted at /data on 100 servers, and I want the I/O of mountpoint /data). I have checked the stats scraped by

Re: [prometheus-users] Re: Expired silences are not deleted after --data.retention has passed

2020-03-10 Thread Simon Pasquier
Expired silences are deleted every 15 minutes. Try checking the alertmanager_silences_gc_duration_seconds_count metric. On Fri, Feb 28, 2020 at 8:59 AM jayson wang <1226266...@qq.com> wrote: > > > I had the same problem > > > On Sunday, 19 January 2020 at 4:44:19 PM UTC+8, Tmac Han wrote: >> >> Any one can help me? Th

[prometheus-users] [ANN] controlling what silences users can create using karma

2020-03-10 Thread Łukasz Mierzwa
Authentication, authorization and multi-tenancy come up a lot in questions, so I've decided to take a stab at it after a feature request ( https://github.com/prymitive/karma/issues/1347). Starting with 0.56, karma will be able to block silences from being added to alertmanager using ACL r

[prometheus-users] Re: query AlertManager

2020-03-10 Thread Brian Candler
I'm not sure exactly what you're trying to do with grafana, but I use karma as alerting dashboard and it does a good job of showing grouped alerts, as well as making a view of multiple alertmanagers in different data centres and being able to push out global silences.

Re: [prometheus-users] query AlertManager

2020-03-10 Thread Simon Pasquier
Grouping isn't exposed as labels in metrics because it would be too much cardinality. Maybe have a look at this Grafana datasource plugin: https://grafana.com/grafana/plugins/camptocamp-prometheus-alertmanager-datasource On Mon, Mar 9, 2020 at 9:29 PM NosIreland wrote: > > Hi All, > Is there a wa

Re: [prometheus-users] usage of rate function on recording metric

2020-03-10 Thread Stuart Clark
On 10/03/2020 07:47, Venkata Bhagavatula wrote: Hi Stuart, Julien, Following is being done in the application side for the metrics given in the above mails.: Some of the metrics used in these charts have multiple labels. Due to the usage of multiple labels and the possible different values o

[prometheus-users] Re: Deleting timeseries

2020-03-10 Thread Venkata Bhagavatula
Hi All, I tried the following today: if I delete a few metrics individually, those metrics are deleted. But if I give an expression to match all metrics, none of them get deleted. Can you let me know if I am missing something? Thanks and regards, Chalapathi. On Tue, Mar 3, 2020 at 12:53 PM

[prometheus-users] Daily increase of a counter in PromQL

2020-03-10 Thread Shunsuke Kirino
Hi, In our application we have a counter that represents something like "total consumption of budget". And we also have a "limit on daily consumption of the budget". So we want to monitor (in our prometheus & grafana) increase of the budget within each day (i.e., we want the resulting time serie

Re: [prometheus-users] usage of rate function on recording metric

2020-03-10 Thread Venkata Bhagavatula
Hi Stuart, Julien, The following is being done on the application side for the metrics given in the above mails: Some of the metrics used in these charts have multiple labels. Due to the usage of multiple labels and the possible different values of these labels, the cardinality of the metrics can be