[prometheus-users] Remove port number from instance value

2020-09-15 Thread kiran
Hello all, I am getting metrics correctly from netdata into Prometheus with the prometheus.yml file below (part of the file). What do I do to not have the port number associated with the IP address in the instance label? - job_name: 'netdata' metrics_path: '/api/v1/allmetrics' params:
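One common way to approach this question is a relabeling rule that copies the scrape address into the instance label without the port. The Python sketch below only illustrates the kind of regular expression such a rule might use; the pattern and the helper name strip_port are assumptions for illustration, not taken from the thread, and IPv6 literals are deliberately left untouched.

```python
import re

# Hypothetical pattern: a host part followed by an optional ":port" suffix.
# Prometheus relabel regexes are fully anchored, so fullmatch() mimics that.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?")


def strip_port(address: str) -> str:
    """Return the host part of 'host:port'; leave any other input unchanged."""
    match = ADDRESS_RE.fullmatch(address)
    return match.group(1) if match else address


if __name__ == "__main__":
    print(strip_port("10.1.2.3:19999"))
```

In a scrape config, the same pattern would typically sit in a relabel_configs rule with source_labels set to __address__ and target_label set to instance; treat that as a starting point to test, not a confirmed answer from the thread.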

Re: [prometheus-users] any best practice on using limited le's for a given histogram

2020-09-15 Thread Aliaksandr Valialkin
FYI, the following article is quite interesting re histograms - https://linuxczar.net/blog/2020/08/13/histogram-error/ On Tue, Sep 15, 2020 at 10:41 PM 'Rong Hu' via Prometheus Users < prometheus-users@googlegroups.com> wrote: > We would love to learn more about the roadmap for histogram

Re: [prometheus-users] SNMP collected value shows up under label and not the metric value

2020-09-15 Thread Linkoid01
I do apologize for the tardy response. I've followed the guide posted by Brian on his webpage and the example given by Ben. It works. Thumbs up, guys. Khanh, I am sorry I didn't figure out what exactly I need to change to Name. On Wednesday, August 26, 2020 at 8:32:00 AM UTC sup...@gmail.com

[prometheus-users] How to name/instrument overload metrics?

2020-09-15 Thread vteja...@gmail.com
Hi, If we consider a simple HTTP Server service, the Prometheus community recommends instrumenting total_requests and failed_requests. I was thinking of the case where the server dropped requests due to overload. How shall we treat such a scenario? - Shall we consider this scenario under

Re: [prometheus-users] any best practice on using limited le's for a given histogram

2020-09-15 Thread 'Rong Hu' via Prometheus Users
We would love to learn more about the roadmap for histogram improvements and rough timeline / estimates for earliest GA. We are trying to standardize our metrics on Prometheus internally and have lots of DDSketch histograms to migrate. In the short term we plan to roughly translate existing

[prometheus-users] Re: Metric type for basic web analytics

2020-09-15 Thread Tim Schwenke
"If you build a powerful prometheus server then a *total of 2 million timeseries* is doable; beyond that you ought to look at sharding across multiple servers." I think you have forgotten a zero. b.ca...@pobox.com wrote on Tuesday, 15 September 2020 at 09:46:49 UTC+2: > On Tuesday, 15

Re: [prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread Tim Schwenke
You can also "preaggregate" with recording rules. Though note that it is not possible to do that with counter type time series *while also* keeping them counters. Christian Hoffmann wrote on Tuesday, 15 September 2020 at 11:14:51 UTC+2: > On 9/15/20 10:55 AM, John Dexter wrote: > > I'm

[prometheus-users] Prometheus metrics based autoscaling apart from default HPA scaling in kubernetes

2020-09-15 Thread dineshnithy...@gmail.com
Hi Team, how do we achieve Prometheus metrics-based autoscaling in Kubernetes workloads? Any best practices or pointers would be highly helpful. Regards, Dinesh -- You received this message because you are subscribed to the Google Groups "Prometheus Users" group. To unsubscribe from this

[prometheus-users] Unable to get the example remote_storage_adapter to work to send to opentsdb

2020-09-15 Thread Brett
I'm trying to get the example storage adapter for opentsdb to work. I have prometheus sending data to an AWS instance running the remote_storage_adapter binary, and also running

Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread rsch...@gmail.com
Corrected the above expression by fixing node_disk_write_read_seconds_total to node_disk_write_time_seconds_total:- (rate( node_disk_read_time_seconds_total [5m]) + rate(node_disk_write_time_seconds_total[5m]))/(rate(node_disk_reads_completed_total[5m]) +

Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread rsch...@gmail.com
Thanks for the quick response. A few clarifications: 1) Is the calculation below right to get the equivalent of "system.io.await"? (rate( node_disk_read_time_seconds_total [5m]) + rate(node_disk_read_time_seconds_total[5m]))/(rate(node_disk_reads_completed_total[5m]) +

[prometheus-users] Re: mapping ip address to host name

2020-09-15 Thread Johny
Yes, sorry, I meant labels, e.g. metric_host="193.44" If you're talking about labels, then you could: - modify the exporter to work in the way you want it to (for example, add a new label saying what data centre it is running in) > It's a third-party exporter and it's not feasible to modify it.

[prometheus-users] Re: mapping ip address to host name

2020-09-15 Thread Brian Candler
Do you really mean they have IP addresses in *values*, or do you mean in *labels* ? A value is a float64 number; it would be possible to put the 32 bits of an IPv4 address in there, but it would be weird. If you're talking about labels, then you could: - modify the exporter to work in the way
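The remark about values can be checked with a short Python sketch: a float64 represents every integer up to 2^53 exactly, so a 32-bit IPv4 address survives the round trip. The function names and the example address are hypothetical, and this illustrates only the arithmetic, not something an exporter is known to do.

```python
import ipaddress


def ipv4_to_sample(address: str) -> float:
    """Encode an IPv4 address as the float64 a sample value would hold."""
    return float(int(ipaddress.IPv4Address(address)))


def sample_to_ipv4(value: float) -> str:
    """Decode a float64 sample value back into dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(value)))


if __name__ == "__main__":
    encoded = ipv4_to_sample("192.0.2.1")
    print(encoded, sample_to_ipv4(encoded))
```

The round trip is lossless, but the resulting number is unreadable in a query or graph, which is why carrying the address in a label is the more natural choice.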

[prometheus-users] mapping ip address to host name

2020-09-15 Thread Johny
I've exporters for some components such as Redis that give IP addresses in values. I need to be able to map IP addresses to actual host names for my query and alert conditions, e.g. to verify master-slave in Redis are in different data centers. How can I fetch metric in prometheus to map a

Re: [prometheus-users] prometheus delete old data files

2020-09-15 Thread Johny
Great, thanks. I will make this change and verify the behavior. On Sunday, September 13, 2020 at 3:28:46 AM UTC-4 sup...@gmail.com wrote: > TSDB blocks are automatically cleaned up, but it does this on the 2 hour > block management schedule. Blocks also must be fully expired (maxTime) > before

[prometheus-users] Re: Scraping multiple entries of the same metric with different values

2020-09-15 Thread Panem78
Thanks a lot for your answer. This completely made it clear for me. On Tuesday, September 15, 2020 at 4:27:45 PM UTC+3 b.ca...@pobox.com wrote: > Each timeseries has to have a different set of labels. You've returned > the same set of labels three times, so this is the same timeseries

[prometheus-users] Re: Scraping multiple entries of the same metric with different values

2020-09-15 Thread Brian Candler
Each timeseries has to have a different set of labels. You've returned the same set of labels three times, so this is the same timeseries repeated three times, and prometheus rejects the additional data as duplicate. You cannot "back fill" values in Prometheus. That is, you cannot export
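To illustrate the point about identical label sets, here is a minimal Python sketch that scans exposition-style lines and reports any series that appears more than once. The helper names are hypothetical and the parsing is deliberately simplified (no escaped quotes, no timestamps), so it is a debugging aid under those assumptions, not a full parser.

```python
import re
from collections import Counter

# Matches label pairs such as service="ui" (escaped quotes are not handled).
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def series_key(line: str):
    """Return (metric name, sorted label pairs) for one exposition line."""
    series = line.rsplit(" ", 1)[0]  # drop the trailing sample value
    name, _, label_text = series.partition("{")
    return name, tuple(sorted(LABEL_RE.findall(label_text)))


def duplicate_series(lines):
    """List the series keys that occur more than once in a scrape body."""
    counts = Counter(
        series_key(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    )
    return [key for key, count in counts.items() if count > 1]


if __name__ == "__main__":
    body = [
        'test_service_metric{service="ui",component="graphs",env="mm"} 180',
        'test_service_metric{service="ui",component="graphs",env="mm"} 200',
    ]
    print(duplicate_series(body))
```

If the values are meant to be distinct series, each line needs at least one label whose value differs from the others.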

[prometheus-users] Scraping multiple entries of the same metric with different values

2020-09-15 Thread Panem78
Hello everyone! I have the following question: I have created a Flask application that, upon a request to its /metrics endpoint, retrieves specific values (strings) from Redis in the form *"test_service_metric{service="ui",component="graphs",env="mm"} 180"* , concatenates them according to

[prometheus-users] Re: Alert once a day

2020-09-15 Thread Aleksandar Ilic
Thanks a lot for your advice and help. Will try it out. Best Regards On Tuesday, September 15, 2020 at 10:36:13 AM UTC+2 b.ca...@pobox.com wrote: > Well, it depends what you're trying to do. At the moment, you have > > > - match: > > alertname:Watchdog > > receiver: slack >

[prometheus-users] metrics monitor for ruby framework

2020-09-15 Thread timothy pember
Good day, We use a Ruby framework such as Rails for web development. Do you know how we can implement metrics monitoring with Prometheus within the framework? Thanks.

[prometheus-users] Re: How to keep metric value unchange after Springboot application restart

2020-09-15 Thread Brian Candler
You could: - Use an external counter, such as statsd_exporter - Persist the counters during shutdown, and reload them during restart You should however note that the *absolute* values of counters are, on the whole, meaningless. If the counter was 1 million yesterday and 1.1 million today, that
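The reason absolute counter values matter so little is that growth can still be recovered across restarts by treating any drop as a reset. The Python sketch below shows that idea on a plain list of samples; the function name is hypothetical, and it is a simplification that ignores the extrapolation Prometheus applies in its own increase() function.

```python
def increase_with_resets(samples):
    """Sum the growth of a counter, treating any decrease as a restart."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # The counter dropped, so assume it restarted from 0 and
            # count everything accumulated since the restart.
            total += current
    return total


if __name__ == "__main__":
    # A restart happens between the second and third sample.
    print(increase_with_resets([100, 150, 10, 40]))
```

Any increments that happened between the last scrape before a restart and the restart itself are lost in this scheme, which is the usual trade-off when counters are not persisted.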

[prometheus-users] How to keep metric value unchange after Springboot application restart

2020-09-15 Thread Daxiang Li
My SpringBoot application needs to be restarted every few days. I want to use the counter indicator to count the growth of the number of visits over a long period of time (months). But every time you restart, the counter indicator will reset to start from 0. What can I do to achieve this goal?

Re: [prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread Christian Hoffmann
On 9/15/20 10:55 AM, John Dexter wrote: > I'm still finding my feet with Prometheus and one thing that is a bit > awkward is that time-series names are pretty cumbersome. We want a > customer-facing dashboard so let's say I want to monitor network activity: > > rate(windows_net_packets_total[2m])

[prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread John Dexter
I'm still finding my feet with Prometheus and one thing that is a bit awkward is that time-series names are pretty cumbersome. We want a customer-facing dashboard so let's say I want to monitor network activity: rate(windows_net_packets_total[2m]) What is displayed is:

[prometheus-users] Re: Alert once a day

2020-09-15 Thread Brian Candler
Well, it depends what you're trying to do. At the moment, you have - match: alertname:Watchdog receiver: slack but that doesn't do anything useful, because the default is also to send to receiver "slack"; I don't know what you're trying to achieve by matching on the

Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread Brian Candler
Sorry, I have no idea what metrics Nginx exports or what Lua scripts in Nginx can do.

Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread Wesley Peng
Brian, Do you know if we can implement a Lua exporter within nginx that takes the application's APM data and reports it to Prometheus? Thank you. Brian Candler wrote: Just to add, the data collected by node_exporter maps closely to the raw stats exposed by the kernel, so the kernel documentation is

[prometheus-users] Re: Metric type for basic web analytics

2020-09-15 Thread Brian Candler
On Tuesday, 15 September 2020 06:11:48 UTC+1, Nick wrote: > > Keeping cardinality explosion in mind, what's a decent maximum number of > exported metrics that can be considered performant for scraping and > time-series processing? It depends on how much resource you're prepared to throw at it.

[prometheus-users] Re: Alert once a day

2020-09-15 Thread Aleksandar Ilic
Hello, Only my daily alert has both daily and watchdog; the rest of the alerts have other tags. My understanding was that if both match, they wouldn't take the default value. Is there any other way I could write the alert so I could get the needed results? Best Regards On Tuesday,

[prometheus-users] Re: expose data from Prometheus

2020-09-15 Thread Brian Candler
That is a k8s question, not a prometheus question. In short: Ingress controllers are how you'd expose *any* HTTP(S) service running in your k8s cluster to the outside world. Examples include Traefik, Nginx,

Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread Brian Candler
Just to add, the data collected by node_exporter maps closely to the raw stats exposed by the kernel, so the kernel documentation is helpful: https://www.kernel.org/doc/Documentation/iostats.txt https://www.kernel.org/doc/html/latest/admin-guide/iostats.html

[prometheus-users] Re: How to capture multiple values in one metric

2020-09-15 Thread Brian Candler
Can you give some examples of the type of query you want to do? If they are of the form "what's the latency that 95% of requests complete within?" then you could use a "summary" instead of "histogram". See: https://prometheus.io/docs/practices/histograms/ However if you genuinely want to "capture

[prometheus-users] Re: Alert once a day

2020-09-15 Thread Brian Candler
What labels does your test alert have? The first rule which matches, wins(*). So if your alert has both "frequency: daily" and "alertname: Watchdog" labels then it will match the first route, and inherit the default repeat_interval of 10m. (*) Unless you set "continue: true", but then the
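The first-match rule described here can be sketched in a few lines of Python. This is a simplified model of a flat list of sibling routes with exact-match label conditions. The receiver name slack-daily, the default receiver name, and the route order are assumptions for illustration, not the poster's actual configuration.

```python
# Hypothetical sibling routes, listed in the order they would be evaluated.
ROUTES = [
    {"match": {"alertname": "Watchdog"}, "receiver": "slack"},
    {"match": {"frequency": "daily"}, "receiver": "slack-daily"},
]


def pick_receivers(alert_labels, routes, default_receiver="default"):
    """Return the receivers an alert reaches under first-match-wins routing."""
    receivers = []
    for route in routes:
        matches = all(
            alert_labels.get(name) == value
            for name, value in route["match"].items()
        )
        if matches:
            receivers.append(route["receiver"])
            # Evaluation stops at the first match unless the route
            # explicitly asks to continue with its later siblings.
            if not route.get("continue", False):
                break
    return receivers or [default_receiver]


if __name__ == "__main__":
    alert = {"alertname": "Watchdog", "frequency": "daily"}
    print(pick_receivers(alert, ROUTES))
```

Under this model an alert carrying both labels never reaches the second route, so reordering the routes or making the matchers more specific changes which settings it inherits.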

Re: [prometheus-users] Re: Metric type for basic web analytics

2020-09-15 Thread Stuart Clark
On 15/09/2020 06:11, Nick wrote: Keeping cardinality explosion in mind, what's a decent maximum number of exported metrics that can be considered performant for scraping and time-series processing? As I mainly need the counter total, I can split the web analytics to reduce the number of

Re: [prometheus-users] Prometheus disconnected data recovery

2020-09-15 Thread Stuart Clark
On 15/09/2020 04:16, tiecheng shen wrote: Hello, I am a newbie to prometheus. I have a requirement. When the prometheus server and the captured client are disconnected from the network, prometheus cannot capture the data when the network is disconnected, and the graph will be disconnected when

Re: [prometheus-users] Re: Prometheus.service status failed

2020-09-15 Thread Suryaprakash Kancharlapalli
Thank you Brian, version 2.21 worked for me On Mon, Sep 14, 2020, 8:47 PM Brian Candler wrote: > The article looks fine, it's just very old. Replace 2.3.2 with latest > version 2.21.0 from https://github.com/prometheus/prometheus/releases > > One of the comments says that using multiple

[prometheus-users] Re: 1st service down alert repeating when 2nd service down after few minutes

2020-09-15 Thread Sandeep Rao Kokkirala
Thanks Brian. It's working. On Monday, September 14, 2020 at 7:24:31 PM UTC+8 b.ca...@pobox.com wrote: > On Monday, 14 September 2020 11:38:43 UTC+1, Sandeep Rao Kokkirala wrote: >> >> consider 1st service is down ..our alertmanager is triggers the alert >> ...when 2nd service is down after