Re: [prometheus-users] Sudden & Permanent increase in Memory consumption.

2021-02-08 Thread Ben Kochie
RSS is not a true number; it's more of an estimate than anything. The Linux kernel's tracking of process memory is fairly lazy, especially when you get into more complex things like MADV_DONTNEED vs MADV_FREE. If you really want a more realistic number, you can look at PSS, but it's an expens

[prometheus-users] Node exporter deployment

2021-02-08 Thread Mohan Nagandlla
Hi team, I have one doubt: one node exporter is already running in my cluster. When I try to deploy another node exporter with Helm, it stays in Pending, and the description shows that port 9100 is already in use. Can anyone guide me on deploying another node exporter for the existe

Re: [prometheus-users] 2 operator conditions in single query

2021-02-08 Thread Ben Kochie
mem_used_percent{host="hostname"} >= 70 and mem_used_percent{host="hostname"} <= 80 On Tue, Feb 9, 2021 at 1:56 AM sunils...@gmail.com wrote: > I have to query servers with memory utilization between 70-80%. > I can query them separately, but how can I combine both queries? > > I tried :
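To illustrate why the `and` in the reply gives the 70-80 range: each comparison filters the series by value, and `and` keeps only the left-hand series whose label set also survives on the right. A minimal Python sketch of that behaviour follows; the host names and values are made up for illustration and do not come from the thread.

```python
# Hypothetical samples: host label -> mem_used_percent (made-up numbers).
samples = {"web-1": 65.0, "web-2": 72.5, "db-1": 80.0, "db-2": 91.3}


def between(series, low, high):
    """Mimic `metric >= low and metric <= high` on a dict of label -> value."""
    # Left-hand side: series that pass the lower bound.
    at_least = {host: v for host, v in series.items() if v >= low}
    # Right-hand side: label sets that pass the upper bound.
    at_most = {host for host, v in series.items() if v <= high}
    # `and` keeps left-hand elements whose labels also appear on the right.
    return {host: v for host, v in at_least.items() if host in at_most}


result = between(samples, 70, 80)
print(result)
```

Writing the two comparisons one after the other without an operator, as in the original question, is not a valid expression, which is why the set operator is needed.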

Re: [prometheus-users] Re: Collecting Cisco NetFlow metrics using snmp_exporter

2021-02-08 Thread Mohammad Zahidul Alam
Hi, I think the issue is resolved; I just needed to add two more MIBs: CISCO-ES-STACK-MIB.mib and CISCO-SMI.mib On Tue, Feb 9, 2021 at 12:38 PM Mohammad Zahidul Alam < cooldude8...@gmail.com> wrote: > Hi, > > You can check out the files from my GitHub repo. > > https://github.com/zahid-alam/prom-

[prometheus-users] Prometheus Query function to retrieve the number of requests for last $period

2021-02-08 Thread rajasree Sekar
Hi all, we have been exploring the Prometheus query functions that could be used to retrieve the number of requests for the last $period from a cumulative stat, http_srv_req_count. sum(http_srv_req_count) - sum(http_srv_req_count offset $period) >= 0 Can you confirm if the below funct
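One caveat with subtracting an offset value from the current value of a cumulative counter is that a restart resets the counter, so the difference can go negative (which the `>= 0` filter in the query above then drops). The Python sketch below contrasts the plain subtraction with a reset-aware sum over consecutive samples. The sample values are invented, and this is only a simplified picture of what PromQL's `increase()` does (it ignores extrapolation).

```python
def naive_delta(now, earlier):
    """Plain subtraction, as in `sum(x) - sum(x offset $period)`."""
    return now - earlier


def reset_aware_increase(samples):
    """Sum the increases between consecutive samples of a counter.

    When a sample is lower than the previous one, treat it as a counter
    reset and count the new value as the increase since the reset.
    """
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


# Hypothetical scrapes of a request counter; it resets between 180 and 20.
scrapes = [100, 140, 180, 20, 60]

print(naive_delta(scrapes[-1], scrapes[0]))
print(reset_aware_increase(scrapes))
```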

[prometheus-users] Re: Collecting Cisco NetFlow metrics using snmp_exporter

2021-02-08 Thread Mohammad Zahidul Alam
Hi, You can check out the files from my GitHub repo. https://github.com/zahid-alam/prom-net-flow On Tuesday, February 9, 2021 at 12:30:28 PM UTC+6 Mohammad Zahidul Alam wrote: > Apparently I can't attach files to the thread. So here's just the > generator file in plain text. > > modules: > c

[prometheus-users] Re: Collecting Cisco NetFlow metrics using snmp_exporter

2021-02-08 Thread Mohammad Zahidul Alam
Apparently I can't attach files to the thread. So here's just the generator file in plain text.
modules:
  cisco_netflow:
    auth:
      community: public
    max_repetitions: 25
    retries: 3
    timeout: 10s
    version: 2
    walk:
      - 1.3.6.1.4.1.9.9.387
-- You received this message bec

[prometheus-users] Collecting Cisco NetFlow metrics using snmp_exporter

2021-02-08 Thread Mohammad Zahidul Alam
Hi all, I'm experimenting with Prometheus to set up a new monitoring system for my company. A new requirement recently came up: we need to monitor NetFlow data from Cisco devices. So I downloaded the Cisco NetFlow MIBs and tried to generate snmp.yml using the generator. But I'm getting the follow

[prometheus-users] 2 operator conditions in single query

2021-02-08 Thread sunils...@gmail.com
I have to query servers with memory utilization between 70-80%. I can query them separately, but how can I combine both queries? I tried: mem_used_percent{host="hostname"} >= 70 mem_used_percent{host="hostname"} <= 80

[prometheus-users] Re: Prometheus keep last value when metric with label disappear

2021-02-08 Thread Matt Palmer
On Sun, Feb 07, 2021 at 09:53:28PM -0800, Andrej Dorinec wrote: > I encountered an issue with metrics that sometimes disappear. I am using SQL exporter to extract data from a database and create metrics from it. I am querying the queue size and there i

[prometheus-users] PromQL Query Using Hour Function

2021-02-08 Thread Chad Thielen
Hello, I'm having some trouble writing a query for an alerting rule that we only want to run at certain times of the day. This is what I originally had, which works, but not in the way we now need: > count(up{job="calculator"} == 0) and count(hour(vector(time())) >= 1 and hour(vector(time())) <
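For readers unfamiliar with the time gate in that query: `hour()` returns the hour of the day in UTC (0 to 23) for a timestamp, and the comparisons keep the result only inside the window. A rough Python equivalent is sketched below. The upper bound is cut off in the snippet above, so `end_hour` here is a hypothetical parameter, not the value from the original post.

```python
from datetime import datetime, timezone


def in_hour_window(ts, start_hour, end_hour):
    """Return True if the UTC hour of Unix timestamp `ts` is in [start_hour, end_hour)."""
    hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
    return start_hour <= hour < end_hour


# Hypothetical check: 03:00 UTC against an assumed 01:00-06:00 window.
three_am = datetime(2021, 2, 8, 3, 0, tzinfo=timezone.utc).timestamp()
print(in_hour_window(three_am, 1, 6))
```

Because the hour is evaluated in UTC, a window meant for local time has to be shifted by the local offset.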

Re: [prometheus-users] Sudden & Permanent increase in Memory consumption.

2021-02-08 Thread yagyans...@gmail.com
Also, I had a look at Go memory utilization. I see that Go memory utilization (go_memstats_alloc_bytes) is around 50% of the total memory used by Prometheus (process_resident_memory_bytes). On Tuesday, February 9, 2021 at 2:39:42 AM UTC+5:30 yagyans...@gmail.com wrote: > Thanks, Ben. I'll upgra

Re: [prometheus-users] Sudden & Permanent increase in Memory consumption.

2021-02-08 Thread Yagyansh S. Kumar
Thanks, Ben. I'll upgrade to a newer Prometheus version and check if the issue still persists. But I still have one doubt: I have been running this Prometheus instance for almost a year now, but I noticed this memory increase only recently. The first time was on 31st Jan and the second on Feb 8. If it r

Re: [prometheus-users] Sudden & Permanent increase in Memory consumption.

2021-02-08 Thread Ben Kochie
Prometheus performs compactions at regular intervals. This is likely what generated some IO. Note, if you're just looking at RSS, this is not going to tell the whole story. Depending on which version of Go you built with, it may not be fast at reclaiming RSS memory. Look at the go_memstats_alloc_

[prometheus-users] Sudden & Permanent increase in Memory consumption.

2021-02-08 Thread yagyans...@gmail.com
Hi. I am using Prometheus version 2.12.0. I am running Alertmanager 0.21.0 in cluster mode. Over the last 9 days, I have twice observed the memory consumption of Prometheus increase by 10-12% and remain at the increased value thereafter. An interesting thing to note here is that both th

[prometheus-users] Re: unit test passes with and without alert

2021-02-08 Thread rbauduin
Here is an even simpler illustration:
# alert.yml definition: trigger if metric is < 1
"groups":
- "name": "prom-alerts"
  "rules":
  - "alert": "NoServer"
    "annotations":
      "message": "There is no server."
    "expr": "metric < 1"
    "labels":
      "severity": "error"
# test.yml: all 0 s

Re: [prometheus-users] How to config Basic Auth with File-based Service Discovery?

2021-02-08 Thread Ben Kochie
That won't work for other discovery methods. We could add a new concept of "secret labels" that would be masked in the UI. On Mon, Feb 8, 2021 at 3:56 AM Ben Teitelbaum wrote: > Instead of setting HTTP basic auth username and password as labels in the > file discovery YAML or JSON, couldn't th

[prometheus-users] unit test passes with and without alert

2021-02-08 Thread rbauduin
Hi, I am writing unit tests for Prometheus alerts as described at https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/ I have an alert defined to trigger after 5 minutes of an expression being < 1 (see below for its definition). I am defining the series values as 1+0x4
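When debugging a rule unit test like this, it can help to write out what a series value in expanding notation actually contains. Per the unit testing documentation linked above, `a+bxn` expands to `a`, `a+b`, up to `a+(n*b)`, which is n+1 samples spaced one test interval apart. The Python sketch below expands only that simple form (it does not handle `_` or `stale`), and the inputs are illustrative because the value in the snippet above is cut off.

```python
import re

# Matches the simple 'a+bxn' / 'a-bxn' form only.
_PATTERN = re.compile(r"(-?\d+(?:\.\d+)?)([+-])(\d+(?:\.\d+)?)x(\d+)")


def expand(notation):
    """Expand 'a+bxn' into the list [a, a+b, ..., a+n*b] (n+1 values)."""
    match = _PATTERN.fullmatch(notation)
    if match is None:
        raise ValueError(f"unsupported notation: {notation!r}")
    start = float(match.group(1))
    sign = 1 if match.group(2) == "+" else -1
    step = sign * float(match.group(3))
    repeats = int(match.group(4))
    return [start + i * step for i in range(repeats + 1)]


print(expand("1+1x3"))
print(len(expand("0+0x5")))
```

Counting the samples this way shows how long the series stays below the threshold, which can then be compared against the alert's 5-minute wait and the chosen evaluation time.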