[prometheus-users] Remove port number from instance value

2020-09-15 Thread kiran
Hello all,

I am getting metrics correctly from netdata into Prometheus with the
prometheus.yml file below (part of the file).
What do I do to not have the port number associated with the IP address in
the instance label?

  - job_name: 'netdata'
    metrics_path: '/api/v1/allmetrics'
    params:
      format: [prometheus]
    honor_labels: true
    file_sd_configs:
      - files:
          - 'nodes.yaml'
    relabel_configs:
      - source_labels: [__address__]
        regex: (.*):(9100)
        target_label: __param_target
        replacement: '${1}:1'
      - source_labels: [__address__]
        regex: (.*):(9100)
        target_label: instance
        replacement: '${1}:1'
      - source_labels: [__param_target]
        regex: (.*)
        target_label: __address__
        replacement: '${1}'
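
One possible fix (a rough sketch, not from the thread; adjust the regex to
whatever port nodes.yaml actually lists) is to capture only the host part of
__address__ when setting the instance label, e.g. replacing the instance rule
above with:

      - source_labels: [__address__]
        regex: '(.*):\d+'
        target_label: instance
        replacement: '${1}'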



Re: [prometheus-users] any best practice on using limited le's for a given histogram

2020-09-15 Thread Aliaksandr Valialkin
FYI, the following article is quite interesting re histograms -
https://linuxczar.net/blog/2020/08/13/histogram-error/

On Tue, Sep 15, 2020 at 10:41 PM 'Rong Hu' via Prometheus Users <
prometheus-users@googlegroups.com> wrote:

> We would love to learn more about the roadmap for histogram improvements
> and rough timeline / estimates for earliest GA. We are trying to
> standardize our metrics on Prometheus internally and have lots of DDSketch
> histograms to migrate. In the short term we plan to roughly translate
> existing DDSketch buckets to default histogram buckets. It would greatly
> incentivize migration internally if the feature gap is filled.
> Thank you for doing this valuable work!
>
> Rong Hu
> Airbnb
>
> On Tuesday, September 8, 2020 at 1:56:49 PM UTC-7 bjo...@rabenste.in
> wrote:
>
>> On 02.09.20 00:38, rs vas wrote:
>> >
>> > • any good number we can cross when defining buckets for example not to
>> > define more than 10 le's.
>>
>> It all really depends on your total cardinality. It's fine to create a
>> histogram with loads of buckets if that's only exposed on three
>> targets and has no further labels at all.
>>
>> In your case, where you have many hosts _and_ partitioning by a bunch
>> of other labels with some significant cardinality, too, you really
>> have to be careful with the number of buckets.
>>
>> A common pattern for something like HTTP request metrics is to have a
>> counter with many labels (like method, path, status code, ...) and
>> then a histogram for the request duration with no further labels (or
>> at least only a few with low cardinality). In that way, you cannot
>> calculate latency per status code and such, but it might be a good
>> compromise.
>>
>> In different news, I'm working on ways to allow high-res histograms in
>> the future, see
>>
>> https://grafana.com/blog/2020/08/24/kubecon-cloudnativecon-eu-recap-better-histograms-for-prometheus/
>> for a bunch of links to talks etc.
>>
>> --
>> Björn Rabenstein
>> [PGP-ID] 0x851C3DA17D748D03
>> [email] bjo...@rabenste.in
>>


-- 
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Re: [prometheus-users] SNMP collected value shows up under label and not the metric value

2020-09-15 Thread Linkoid01
I do apologize for the tardy response. I've followed the guide posted by 
Brian on his webpage and the example given by Ben. It works. Thumbs up, guys.
Khanh, I'm sorry I didn't figure out what exactly I needed to change to Name.
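
For reference, the approach in the article linked in the quoted reply below is
a regex_extracts override in the snmp_exporter generator.yml; a rough sketch
(the module and object names here are placeholders, not the exact DELL-RAC-MIB
entries):

   modules:
     dell_rac:
       walk:
         - drsCMCPowerReading
       overrides:
         drsCMCPowerReading:
           regex_extracts:
             '':
               - regex: '(\d+)'
                 value: '$1'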

On Wednesday, August 26, 2020 at 8:32:00 AM UTC sup...@gmail.com wrote:

> Here's an example for this MIB:
>
> https://github.com/SuperQ/tools/tree/master/snmp_exporter/hp
>
> On Wed, Aug 26, 2020 at 10:11 AM Brian Candler  wrote:
>
>> On Sunday, 16 August 2020 13:00:13 UTC+1, Linkoid01 wrote:
>>>
>>> I've manually changed the snmp.yml file from OctetString to 
>>> DisplayString and I am able to see the values and not a hex representation. 
>>> I've checked the DELL-RAC-MIB and the file is correct, like you said 
>>> DellPowerReading ::= DisplayString (SIZE (0..32)).
>>> How do I get to place the power reading as a value?
>>>
>>
>>
>> https://www.robustperception.io/numbers-from-displaystrings-with-the-snmp_exporter
>>



[prometheus-users] How to name/instrument overload metrics?

2020-09-15 Thread vteja...@gmail.com
Hi,

If we consider a simple HTTP Server service, the Prometheus community 
recommends instrumenting total_requests and failed_requests. I was thinking 
of the case where the server dropped requests due to overload. How shall we 
treat such a scenario?

   - Shall we consider this scenario under failed_requests and identify it 
   using a special label value like condition="overload"?
   - Or shall we instrument a new metric like http_dropped_requests?
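
As a sketch of the two options in exposition format (metric and label names
here are illustrative only):

   # option 1: one failure counter with a reason label
   http_requests_failed_total{condition="overload"} 3
   # option 2: a dedicated counter for drops
   http_requests_dropped_total 3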

When I look to k8s for inspiration, they have total requests, failed 
requests, and dropped requests. But I'm not sure which pattern is preferred 
in terms of querying, visualizing, and alerting.

In general, is it a good pattern to have a lot of metrics, or a few needed 
ones with good instrumentation (label-based differentiation)?

Keeping the golden signals in mind, overload-related metrics do fall under the 
RED category. But I'm not sure whether I need to put them under "R" or "E".

Thanks,
Teja



Re: [prometheus-users] any best practice on using limited le's for a given histogram

2020-09-15 Thread 'Rong Hu' via Prometheus Users
We would love to learn more about the roadmap for histogram improvements 
and rough timeline / estimates for earliest GA. We are trying to 
standardize our metrics on Prometheus internally and have lots of DDSketch 
histograms to migrate. In the short term we plan to roughly translate 
existing DDSketch buckets to default histogram buckets. It would greatly 
incentivize migration internally if the feature gap is filled.
Thank you for doing this valuable work! 

Rong Hu
Airbnb

On Tuesday, September 8, 2020 at 1:56:49 PM UTC-7 bjo...@rabenste.in wrote:

> On 02.09.20 00:38, rs vas wrote: 
> > 
> > • any good number we can cross when defining buckets for example not to 
> > define more than 10 le's. 
>
> It all really depends on your total cardinality. It's fine to create a 
> histogram with loads of buckets if that's only exposed on three 
> targets and has no further labels at all. 
>
> In your case, where you have many hosts _and_ partitioning by a bunch 
> of other labels with some significant cardinality, too, you really 
> have to be careful with the number of buckets. 
>
> A common pattern for something like HTTP request metrics is to have a 
> counter with many labels (like method, path, status code, ...) and 
> then a histogram for the request duration with no further labels (or 
> at least only a few with low cardinality). In that way, you cannot 
> calculate latency per status code and such, but it might be a good 
> compromise. 
>
> In different news, I'm working on ways to allow high-res histograms in 
> the future, see 
>
> https://grafana.com/blog/2020/08/24/kubecon-cloudnativecon-eu-recap-better-histograms-for-prometheus/
>  
> for a bunch of links to talks etc. 
>
> -- 
> Björn Rabenstein 
> [PGP-ID] 0x851C3DA17D748D03 
> [email] bjo...@rabenste.in 
>
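
To make the pattern described above concrete, the exposed metrics might look
roughly like this (hypothetical metric names; note the histogram carries no
method/path/code labels):

   http_requests_total{method="GET",path="/api/v1/items",code="200"} 1027
   http_request_duration_seconds_bucket{le="0.1"} 950
   http_request_duration_seconds_bucket{le="0.5"} 1020
   http_request_duration_seconds_bucket{le="+Inf"} 1027
   http_request_duration_seconds_sum 61.7
   http_request_duration_seconds_count 1027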



[prometheus-users] Re: Metric type for basic web analytics

2020-09-15 Thread Tim Schwenke
"If you build a powerful prometheus server then a* total of 2 million 
timeseries* is doable; beyond that you ought to look at sharding across 
multiple servers."

I think you have forgot a zero

b.ca...@pobox.com wrote on Tuesday, 15 September 2020 at 09:46:49 UTC+2:

> On Tuesday, 15 September 2020 06:11:48 UTC+1, Nick wrote:
>>
>> Keeping cardinality explosion in mind, what's a decent maximum number of 
>> exported metrics that can be considered performant for scraping and 
>> time-series processing?
>
>
> It depends on how much resource you're prepared to throw at it.  If you 
> build a powerful prometheus server then a total of 2 million timeseries is 
> doable; beyond that you ought to look at sharding across multiple servers.
>
> As I mainly need the counter total, I can split the web analytics to 
>> reduce the number of possible label combinations, for example:
>>
>> { domain, page }
>> { domain, browser }
>>
>
> Yes that's fine, but you still want to limit the number of values for each 
> label.  As Stuart said: in the case of browser you don't want the raw 
> User-Agent header, but pick between a small pre-determined set of values 
> like "Firefox", "Chrome", "IE", "Other".  In the case of "page" strip out 
> any query string, and ideally also limit to known pages or "Other".
>
> If you also want to record the raw User-Agent values for every request 
> then do that in a separate logging system (e.g. loki, elasticsearch, a SQL 
> database, or even just plain text files)
>



Re: [prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread Tim Schwenke
You can also "preaggregate" with recording rules. Though note that it is 
not possible to do that with counter-type time series *while also* keeping 
them counters.

Christian Hoffmann wrote on Tuesday, 15 September 2020 at 11:14:51 
UTC+2:

> On 9/15/20 10:55 AM, John Dexter wrote:
> > I'm still finding my feet with Prometheus and one thing that is a bit
> > awkward is that time-series names are pretty cumbersome. We want a
> > customer-facing dashboard so let's say I want to monitor network 
> activity:
> > 
> > rate(windows_net_packets_total[2m])
> > 
> > What is displayed is:
> > 
> > {instance="localhost:9182",job="node",nic="Local_Area_Connection__11"} 0
> > {instance="localhost:9182",job="node",nic="isatap__..._"} 0
> > {instance="localhost:9182",job="node",nic="vmxnet3_Ethernet_Adapter__2"}
> > 14.411582607039099
> > 
> > If I push into Grafana I get the 3 time-series displayed and the sort of
> > issues I face are:
> > 
> > * instance=..., job=... is pretty verbose, I wish it just said
> > 'localhost'. Is this possible somehow?
> Yes, Grafana handles this. You can use Grafana templates in the Legend
> field (e.g. {{instance}})¹.
>
> If you want to get rid of the port as well, you might want to look into
> modifying your instance label when scraping.
>
> Example:
> https://www.robustperception.io/controlling-the-instance-label
>
> > * I just want one time-series per machine, and I don't really want to
> > have to hard-code nic-name in my YAML. Does PromQL let me
> > aggregate over a specific label?
> Yes, sure. Just choose the kind of aggregation, e.g.
>
> sum by (instance) (... your query ...)
>
>
> https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators
>
> Kind regards,
> Christian
>



[prometheus-users] Prometheus metrics based autoscaling apart from default HPA scaling in kubernetes

2020-09-15 Thread dineshnithy...@gmail.com
Hi Team

How do we achieve Prometheus-metrics-based auto-scaling for Kubernetes 
workloads? Any best practices or pointers would be highly helpful.

Regards
Dinesh



[prometheus-users] Unable to get the example remote_storage_adapter to work to send to opentsdb

2020-09-15 Thread Brett
I'm trying to get the example remote storage adapter for opentsdb to work. I 
have prometheus sending data to an AWS instance running the 
remote_storage_adapter binary, and also running an opentsdb server.

I'm running opentsdb with this:
docker run -d -p 4242:4242 -v 
$(pwd)/opentsdb.conf:/etc/opentsdb/opentsdb.conf petergrace/opentsdb-docker

I'm running the remote storage adapter with this:
./remote_storage_adapter --web.listen-address=":80" --opentsdb-url="http://localhost:4242/" --log.level=debug



The config file is:

tsd.network.port = 4242
tsd.http.staticroot = /usr/local/share/opentsdb/static/
tsd.http.cachedir = /tmp/opentsdb
tsd.core.plugin_path = /opentsdb-plugins
tsd.core.auto_create_metrics = true
tsd.http.request.enable_chunked = true
tsd.http.request.max_chunk = 65535


I'm not getting anything in opentsdb so I looked at the 
remote_storage_adapter logs, and I see this error repeated:

level=warn ts=2020-09-15T16:53:56.596Z caller=main.go:330 msg="Error 
sending samples to remote storage" err="json: cannot unmarshal object into 
Go value of type int" storage=opentsdb num_samples=100

I did a packet capture and it looks like I'm getting a response from 
opentsdb 

HTTP/1.1 400 Bad Request



Does anyone have any ideas?



Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread rsch...@gmail.com
Corrected the above expression by fixing node_disk_write_read_seconds_total 
to node_disk_write_time_seconds_total:

(rate(node_disk_read_time_seconds_total[5m]) + rate(node_disk_write_time_seconds_total[5m]))
  / (rate(node_disk_reads_completed_total[5m]) + rate(node_disk_writes_completed_total[5m])) > 
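
Regarding the CPU-load question quoted below, one option is to precompute this
ratio with a recording rule so alerts and dashboards reuse it; a minimal
sketch (the group and record names are made up):

   groups:
     - name: disk_io
       rules:
         - record: instance_device:node_disk_io_await_seconds:rate5m
           expr: |
             (rate(node_disk_read_time_seconds_total[5m]) + rate(node_disk_write_time_seconds_total[5m]))
             /
             (rate(node_disk_reads_completed_total[5m]) + rate(node_disk_writes_completed_total[5m]))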

On Tuesday, September 15, 2020 at 11:31:43 AM UTC-7 rsch...@gmail.com wrote:

> Thanks for the quick response. A few clarifications:
> 1) Is the calculation below right to get the equivalent of "system.io.await"?
> (rate(node_disk_read_time_seconds_total[5m]) + rate(node_disk_read_time_seconds_total[5m]))
>   / (rate(node_disk_reads_completed_total[5m]) + rate(node_disk_writes_completed_total[5m])) > 
>
> 2) If the prometheus server has to perform this calculation, say, every 5 
> minutes, will this cause CPU load if there are many such alerts for 
> different environments firing at the same time? Is there any better 
> alternative to this?
>
>
>
> On Monday, September 14, 2020 at 10:24:56 PM UTC-7 wp...@pobox.com wrote:
>
>> You can calculate it from the basic IO metrics Prometheus provided.
>>
>> Regards.
>>
>>
>> rsch...@gmail.com wrote:
>>
>> In datadog I used metrics "system.io.await" to create alert on my linux 
>> instances. What is the equivalent metrics in prometheus? 
>>
>>



Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread rsch...@gmail.com
Thanks for the quick response. A few clarifications:
1) Is the calculation below right to get the equivalent of "system.io.await"?
(rate(node_disk_read_time_seconds_total[5m]) + rate(node_disk_read_time_seconds_total[5m]))
  / (rate(node_disk_reads_completed_total[5m]) + rate(node_disk_writes_completed_total[5m])) > 

2) If the prometheus server has to perform this calculation, say, every 5 
minutes, will this cause CPU load if there are many such alerts for 
different environments firing at the same time? Is there any better 
alternative to this?



On Monday, September 14, 2020 at 10:24:56 PM UTC-7 wp...@pobox.com wrote:

> You can calculate it from the basic IO metrics Prometheus provided.
>
> Regards.
>
>
> rsch...@gmail.com wrote:
>
> In datadog I used metrics "system.io.await" to create alert on my linux 
> instances. What is the equivalent metrics in prometheus? 
>
>



[prometheus-users] Re: mapping ip address to host name

2020-09-15 Thread Johny
Yes sorry I meant labels. e.g. metric_host="193.44"

If you're talking about labels, then you could:
- modify the exporter to work in the way you want it to (for example, add a 
new label saying what data centre it is running in)
> it's a third-party exporter and it's not feasible to modify it.
- write a small proxy which updates the labels on the fly during scrapes
> still, how do I map that IP address to a host?
- use metric relabeling to map a few specific IP addresses to names
> there are many hosts and the IPs are dynamic in nature, so static mapping 
won't work.
- generate some static timeseries which map a few specific IPs to specific 
names, and join on them in queries.

Does prometheus provide a built-in label with the IP address during scraping? 
The scrape config contains the host name and port.


On Tuesday, September 15, 2020 at 12:02:00 PM UTC-4 b.ca...@pobox.com wrote:

> Do you really mean they have IP addresses in *values*, or do you mean in 
> *labels* ?  A value is a float64 number; it would be possible to put the 32 
> bits of an IPv4 address in there, but it would be weird.
>
> If you're talking about labels, then you could:
> - modify the exporter to work in the way you want it to (for example, add 
> a new label saying what data centre it is running in)
> - write a small proxy which updates the labels on the fly during scrapes
> - use metric relabeling to map a few specific IP addresses to names
> - generate some static timeseries which map a few specific IPs to specific 
> names, and join on them in queries.
>
> The last two cases only work if the number of different IP addresses you 
> expect to see is small and known in advance.  For an explanation of the 
> last one see
> https://www.robustperception.io/how-to-have-labels-for-machine-roles
> https://www.robustperception.io/exposing-the-software-version-to-prometheus
>
> https://prometheus.io/docs/prometheus/latest/querying/operators/#many-to-one-and-one-to-many-vector-matches
>
> If the different data centres have different IP ranges, it may be enough 
> to match on the prefix: e.g.
>
> expr: foo{role="master",ip=~"10\.123\..+"} and 
> bar{role="slave",ip=~"10\.123\..+"}
>
>



[prometheus-users] Re: mapping ip address to host name

2020-09-15 Thread Brian Candler
Do you really mean they have IP addresses in *values*, or do you mean in 
*labels* ?  A value is a float64 number; it would be possible to put the 32 
bits of an IPv4 address in there, but it would be weird.

If you're talking about labels, then you could:
- modify the exporter to work in the way you want it to (for example, add a 
new label saying what data centre it is running in)
- write a small proxy which updates the labels on the fly during scrapes
- use metric relabeling to map a few specific IP addresses to names
- generate some static timeseries which map a few specific IPs to specific 
names, and join on them in queries.

The last two cases only work if the number of different IP addresses you 
expect to see is small and known in advance.  For an explanation of the 
last one see
https://www.robustperception.io/how-to-have-labels-for-machine-roles
https://www.robustperception.io/exposing-the-software-version-to-prometheus
https://prometheus.io/docs/prometheus/latest/querying/operators/#many-to-one-and-one-to-many-vector-matches

If the different data centres have different IP ranges, it may be enough to 
match on the prefix: e.g.

expr: foo{role="master",ip=~"10\.123\..+"} and 
bar{role="slave",ip=~"10\.123\..+"}
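
A sketch of the last option, assuming a hand-maintained mapping series such as
host_info{ip="10.123.0.5",host="redis-a",dc="dc1"} 1 is exposed somewhere (the
metric and label names here are hypothetical):

   foo{role="master"} * on (ip) group_left (host, dc) host_info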



[prometheus-users] mapping ip address to host name

2020-09-15 Thread Johny
I've exporters for some components such as Redis that give IP addresses in 
values. I need to be able to map IP addresses to actual host names for my 
query and alert conditions, e.g. to verify master-slave in Redis are in 
different data centers.

How can I fetch metric in prometheus to map a host IP to host name?

  



Re: [prometheus-users] prometheus delete old data files

2020-09-15 Thread Johny
Great, thanks. I will make this change and verify the behavior.
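
To check the maxTime mentioned in the reply quoted below, you can inspect a
block's meta.json directly; a rough sketch (the data path and block directory
are placeholders; the timestamps are Unix milliseconds):

   jq '.minTime, .maxTime' /path/to/prometheus/data/<block-ulid>/meta.json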

On Sunday, September 13, 2020 at 3:28:46 AM UTC-4 sup...@gmail.com wrote:

> TSDB blocks are automatically cleaned up, but it does this on the 2 hour 
> block management schedule. Blocks also must be fully expired (maxTime) 
> before they are deleted.
>
> You probably just need to wait for the maxTime on the oldest block to 
> expire. Look in the meta.json in the TSDB block directories.
>
> On Sun, Sep 13, 2020 at 3:35 AM Johny  wrote:
>
>> I am reducing data retention from 20 days to 10 days in my Prometheus 
>> nodes (v. 2.17). When I change *storage.tsdb.retention.time* to 10d and 
>> restart my instances, this does not delete data older than 10 days. Is 
>> there a command to force cleanup?
>>
>> In general, what is best practice to delete older data in Prometheus? 
>>
>>
>



[prometheus-users] Re: Scraping multiple entries of the same metric with different values

2020-09-15 Thread Panem78
Thanks a lot for your answer. This completely made it clear for me. 

On Tuesday, September 15, 2020 at 4:27:45 PM UTC+3 b.ca...@pobox.com wrote:

> Each timeseries has to have a different set of labels.  You've returned 
> the same set of labels three times, so this is the same timeseries repeated 
> three times, and prometheus rejects the additional data as duplicate.
>
> You cannot "back fill" values in Prometheus.  That is, you cannot export 
> points 160@t1, 170@t2, 180@t3 in the same scrape.  You can export value 
> 160, and then on next scrape 170, and then on next scrape 180; and the 
> timestamps assigned to those points by prometheus will be the times that 
> the scrapes took place.
>
> If you need to back-fill historical data, then prometheus is not for you.  
> However VictoriaMetrics might meet your needs, as it's very Prometheus-like 
> (it implements a superset of the PromQL API) but *does* support 
> back-filling in various different formats.
>



[prometheus-users] Re: Scraping multiple entries of the same metric with different values

2020-09-15 Thread Brian Candler
Each timeseries has to have a different set of labels.  You've returned the 
same set of labels three times, so this is the same timeseries repeated 
three times, and prometheus rejects the additional data as duplicate.

You cannot "back fill" values in Prometheus.  That is, you cannot export 
points 160@t1, 170@t2, 180@t3 in the same scrape.  You can export value 
160, and then on next scrape 170, and then on next scrape 180; and the 
timestamps assigned to those points by prometheus will be the times that 
the scrapes took place.

If you need to back-fill historical data, then prometheus is not for you.  
However VictoriaMetrics might meet your needs, as it's very Prometheus-like 
(it implements a superset of the PromQL API) but *does* support 
back-filling in various different formats.
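
For the exporter in the original question, that means each exposed line needs
at least one distinguishing label; a sketch (the extra label name is
hypothetical):

   test_service_metric{service="ui",component="graphs",env="mm",shard="a"} 160
   test_service_metric{service="ui",component="graphs",env="mm",shard="b"} 170
   test_service_metric{service="ui",component="graphs",env="mm",shard="c"} 180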



[prometheus-users] Scraping multiple entries of the same metric with different values

2020-09-15 Thread Panem78


Hello everyone! I have the following question:
I have created a flask application that, upon a request to its /metrics 
endpoint, retrieves specific values (strings) from Redis in the form
*"test_service_metric{service="ui",component="graphs",env="mm"} 180"*,

concatenates them according to the prometheus exposition format (each entry 
separated by a newline character and a final newline at the end), and finally 
returns in the Response a string like the following:



*"test_service_metric{service="ui",component="graphs",env="mm"} 
160test_service_metric{service="ui",component="graphs",env="mm"} 
170test_service_metric{service="ui",component="graphs",env="mm"} 180"*

So what I am actually trying to achieve is for prometheus to retrieve all 
of these entries by scraping (they refer to the same metric but with 
different values) and display them in the console/graph. But what actually 
happens is that only one of them shows up in Prometheus.

Is there something I am doing wrong? Or, because the scrape time is the 
same for all these entries, are all but one regarded as duplicates and 
removed?

Thanks in advance, and please let me know if you need any clarifications!



[prometheus-users] Re: Alert once a day

2020-09-15 Thread Aleksandar Ilic

Thanks a lot for your advice and help. Will try it out.

Best Regards
On Tuesday, September 15, 2020 at 10:36:13 AM UTC+2 b.ca...@pobox.com wrote:

> Well, it depends what you're trying to do.  At the moment, you have
>
>
> - match:
>     alertname: Watchdog
>   receiver: slack
>
>
> but that doesn't do anything useful, because the default is also to send 
> to receiver "slack"; I don't know what you're trying to achieve by matching 
> on the alertname here.
>
>
> You could remove it, then you would have:
>
>
> route:
>   group_by:
>     - alertname
>   receiver: slack
>   repeat_interval: 10m
>   routes:
>     - match:
>         frequency: daily
>       receiver: slack
>       repeat_interval: 24h
>
>
> That would send all alerts to slack with 10 minute repeat interval, except 
> those with "frequency: daily" which go to slack with 24 hour repeat 
> interval.
>
>
> You can combine label matches, e.g.
>
>
> route:
>   group_by:
>     - alertname
>   receiver: slack
>   repeat_interval: 10m
>   routes:
>     - match:
>         frequency: daily
>         alertname: Watchdog
>       receiver: foo
>       repeat_interval: 24h
>     - match:
>         alertname: Watchdog
>       receiver: bar
>       repeat_interval: 30m
>
>
> Alerts which are Watchdog *and* tagged daily get sent to receiver "foo" 
> with 24 hour repeat interval.  Alerts which are tagged Watchdog (but not 
> daily) are sent to receiver "bar" with 30m repeat interval.  Anything which 
> doesn't match these will use the default receiver "slack" and interval 10m.
>
>
> You can also nest routes:
>
>
> route:
>   group_by:
>     - alertname
>   receiver: slack
>   repeat_interval: 10m
>   routes:
>     - match:
>         frequency: daily
>       repeat_interval: 24h
>       receiver: baz
>       routes:
>         - match:
>             alertname: Watchdog
>           receiver: foo
>         - match:
>             alertname: DiskSpace
>           receiver: bar
>
>
> That is, if there is a label "frequency: daily" then it uses the set of 
> routes underneath, and if none of them match, it will use the defaults from 
> the parent.
>
>
> But to be honest, I don't think you want to set "frequency: daily" as a 
> label on the alert anyway.  Where you deliver alerts to, and how often, is 
> a property of the alert routing, not the alert itself.  You might have a 
> particular alert which needs to be sent every 10 minutes to slack, but only 
> daily to your manager.  If your policy is "watchdog alerts should only be 
> sent to slack every 24 hours", then you can encode that policy directly in 
> alertmanager, rather than labelling the alert as "daily".
>
>
> This also allows more complex policies like "watchdog alerts from dev 
> systems should only be sent every 24 hours, but watchdog alerts from 
> production systems should be sent every 60 minutes"
>



[prometheus-users] metrics monitor for ruby framework

2020-09-15 Thread timothy pember

Good day,

We use a Ruby framework such as Rails for web development.
Do you know how we can implement metrics monitoring with Prometheus within 
the framework?


Thanks.



[prometheus-users] Re: How to keep metric value unchange after Springboot application restart

2020-09-15 Thread Brian Candler
You could:
- Use an external counter, such as statsd_exporter
- Persist the counters during shutdown, and reload them during restart

You should however note that the *absolute* values of counters are, on the 
whole, meaningless.  If the counter was 1 million yesterday and 1.1 million 
today, that tells you something: you had 0.1 million hits in 24 hours.  But 
if the counter was 2 million yesterday and 2.1 million today, it tells you 
the same thing.

If you want to work out the average rate of visitors over a long period 
(say a month), prometheus can do this for you, even though there are 
counter resets.
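
For example, the number of visits over the last 30 days can be read from the
counter at query time, and increase() tolerates the resets caused by restarts
(the metric name is a placeholder):

   increase(http_requests_total[30d])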



[prometheus-users] How to keep metric value unchange after Springboot application restart

2020-09-15 Thread Daxiang Li
My SpringBoot application needs to be restarted every few days. I want to 
use a counter metric to count the growth of the number of visits over 
a long period of time (months). But every time the application restarts, the 
counter resets and starts again from 0.

What can I do to achieve this goal?



Re: [prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread Christian Hoffmann
On 9/15/20 10:55 AM, John Dexter wrote:
> I'm still finding my feet with Prometheus and one thing that is a bit
> awkward is that time-series names are pretty cumbersome. We want a
> customer-facing dashboard so let's say I want to monitor network activity:
> 
> rate(windows_net_packets_total[2m])
> 
> What is displayed is:
> 
> {instance="localhost:9182",job="node",nic="Local_Area_Connection__11"} 0
> {instance="localhost:9182",job="node",nic="isatap__..._"} 0
> {instance="localhost:9182",job="node",nic="vmxnet3_Ethernet_Adapter__2"}
> 14.411582607039099
> 
> If I push into Grafana I get the 3 time-series displayed and the sort of
> issues I face are:
> 
>   * instance=..., job=... is pretty verbose, I wish it just said
> 'localhost'. Is this possible somehow?
Yes, Grafana handles this. You can use Grafana templates in the Legend
field (e.g. {{instance}})¹.

If you want to get rid of the port as well, you might want to look into
modifying your instance label when scraping.

Example:
https://www.robustperception.io/controlling-the-instance-label

>   * I just want one time-series per machine, and I don't really want to
> have to hard-code nic-name in my YAML. Does PromQL let me
> aggregate over a specific label?
Yes, sure. Just choose the kind of aggregation, e.g.

sum by (instance) (... your query ...)

https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators
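
Putting the two together, a per-machine panel could use something like the
query below (based on the metric from the original post), with the Grafana
Legend field set to {{instance}}:

   sum by (instance) (rate(windows_net_packets_total[2m]))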

Kind regards,
Christian



[prometheus-users] Prettifying and simplifying metrics/visualizations

2020-09-15 Thread John Dexter
I'm still finding my feet with Prometheus and one thing that is a bit
awkward is that time-series names are pretty cumbersome. We want a
customer-facing dashboard so let's say I want to monitor network activity:

rate(windows_net_packets_total[2m])

What is displayed is:

{instance="localhost:9182",job="node",nic="Local_Area_Connection__11"} 0
{instance="localhost:9182",job="node",nic="isatap__..._"} 0
{instance="localhost:9182",job="node",nic="vmxnet3_Ethernet_Adapter__2"}
14.411582607039099

If I push into Grafana I get the 3 time-series displayed and the sort of
issues I face are:

   - instance=..., job=... is pretty verbose, I wish it just said
   'localhost'. Is this possible somehow?
   - I just want one time-series per machine, and I don't really want to
   have to hard-code nic-name in my YAML. Does PromQL let me aggregate over a
   specific label?

Many thanks for any pointers,
John.



[prometheus-users] Re: Alert once a day

2020-09-15 Thread Brian Candler


Well, it depends what you're trying to do.  At the moment, you have


- match:
    alertname: Watchdog
  receiver: slack


but that doesn't do anything useful, because the default is also to send to 
receiver "slack"; I don't know what you're trying to achieve by matching on 
the alertname here.


You could remove it, then you would have:


route:
  group_by:
    - alertname
  receiver: slack
  repeat_interval: 10m
  routes:
    - match:
        frequency: daily
      receiver: slack
      repeat_interval: 24h


That would send all alerts to slack with 10 minute repeat interval, except 
those with "frequency: daily" which go to slack with 24 hour repeat 
interval.


You can combine label matches, e.g.


route:
  group_by:
    - alertname
  receiver: slack
  repeat_interval: 10m
  routes:
    - match:
        frequency: daily
        alertname: Watchdog
      receiver: foo
      repeat_interval: 24h
    - match:
        alertname: Watchdog
      receiver: bar
      repeat_interval: 30m


Alerts which are Watchdog *and* tagged daily get sent to receiver "foo" 
with 24 hour repeat interval.  Alerts which are tagged Watchdog (but not 
daily) are sent to receiver "bar" with 30m repeat interval.  Anything which 
doesn't match these will use the default receiver "slack" and interval 10m.


You can also nest routes:


route:
  group_by:
    - alertname
  receiver: slack
  repeat_interval: 10m
  routes:
    - match:
        frequency: daily
      repeat_interval: 24h
      receiver: baz
      routes:
        - match:
            alertname: Watchdog
          receiver: foo
        - match:
            alertname: DiskSpace
          receiver: bar


That is, if there is a label "frequency: daily" then it uses the set of 
routes underneath, and if none of them match, it will use the defaults from 
the parent.


But to be honest, I don't think you want to set "frequency: daily" as a 
label on the alert anyway.  Where you deliver alerts to, and how often, is 
a property of the alert routing, not the alert itself.  You might have a 
particular alert which needs to be sent every 10 minutes to slack, but only 
daily to your manager.  If your policy is "watchdog alerts should only be 
sent to slack every 24 hours", then you can encode that policy directly in 
alertmanager, rather than labelling the alert as "daily".


This also allows more complex policies like "watchdog alerts from dev 
systems should only be sent every 24 hours, but watchdog alerts from 
production systems should be sent every 60 minutes"



Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread Brian Candler
Sorry, I have no idea what metrics Nginx exports or what Lua scripts in 
Nginx can do.



Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread Wesley Peng

Brian,

Do you know if we can implement a Lua exporter within nginx that takes the 
application's APM data and reports it to prometheus?


Thank you.


Brian Candler wrote:
Just to add, the data collected by node_exporter maps closely to the raw 
stats exposed by the kernel, so the kernel documentation is helpful:




[prometheus-users] Re: Metric type for basic web analytics

2020-09-15 Thread Brian Candler
On Tuesday, 15 September 2020 06:11:48 UTC+1, Nick wrote:
>
> Keeping cardinality explosion in mind, what's a decent maximum number of 
> exported metrics that can be considered performant for scraping and 
> time-series processing?


It depends on how much resource you're prepared to throw at it.  If you 
build a powerful prometheus server then a total of 2 million timeseries is 
doable; beyond that you ought to look at sharding across multiple servers.

As I mainly need the counter total, I can split the web analytics to reduce 
> the number of possible label combinations, for example:
>
> { domain, page }
> { domain, browser }
>

Yes that's fine, but you still want to limit the number of values for each 
label.  As Stuart said: in the case of browser you don't want the raw 
User-Agent header, but pick between a small pre-determined set of values 
like "Firefox", "Chrome", "IE", "Other".  In the case of "page" strip out 
any query string, and ideally also limit to known pages or "Other".

If you also want to record the raw User-Agent values for every request then 
do that in a separate logging system (e.g. loki, elasticsearch, a SQL 
database, or even just plain text files)
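
A sketch of what the two split metrics might look like after that
pre-processing (the metric names are made up for illustration):

   webapp_page_views_total{domain="example.com",page="/pricing"} 123
   webapp_browser_views_total{domain="example.com",browser="Firefox"} 57
   webapp_browser_views_total{domain="example.com",browser="Other"} 12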



[prometheus-users] Re: Alert once a day

2020-09-15 Thread Aleksandar Ilic
Hello,

Only my daily alert has both the daily and watchdog labels; the rest of the 
alerts have other tags. I guess my understanding was that if both match, they 
wouldn't take the default value.

Is there any other way I could write the alert so I could get the needed 
results?

Best Regards

On Tuesday, September 15, 2020 at 9:17:56 AM UTC+2 b.ca...@pobox.com wrote:

> What labels does your test alert have?
>
> The first rule which matches, wins(*).  So if your alert has both 
> "frequency: daily" and "alertname: Watchdog" labels then it will match the 
> first route, and inherit the default repeat_interval of 10m.
>
> (*) Unless you set "continue: true", but then the result could be sent to 
> multiple receivers at the same level.
>
>



[prometheus-users] Re: expose data from Prometheus

2020-09-15 Thread Brian Candler
That is a k8s question, not a prometheus question.  In short: Ingress 
controllers are how you'd expose *any* HTTP(S) service running in your k8s 
cluster to the outside world.  Examples include Traefik, Nginx, HAProxy, 
MetalLB.  Use what you're already using to expose other services.
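
A minimal sketch of such an Ingress (the hostname and service name are
assumptions; it presumes a Service fronting Prometheus on port 9090):

   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: prometheus
   spec:
     rules:
       - host: prometheus.example.com
         http:
           paths:
             - path: /
               pathType: Prefix
               backend:
                 service:
                   name: prometheus
                   port:
                     number: 9090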

I notice you say "we are able to see the data in prometehus [sic] UI".  The 
API and the UI are the same thing (on port 9090 by default), so if you can 
see one, you can see the other.



Re: [prometheus-users] Prometheus disk I/O metrics

2020-09-15 Thread Brian Candler
Just to add, the data collected by node_exporter maps closely to the raw 
stats exposed by the kernel, so the kernel documentation is helpful:
https://www.kernel.org/doc/Documentation/iostats.txt
https://www.kernel.org/doc/html/latest/admin-guide/iostats.html



[prometheus-users] Re: How to capture multiple values in one metric

2020-09-15 Thread Brian Candler
Can you give some examples of the type of query you want to do?

If they are of the form "what's the latency that 95% of requests complete 
within?" then you could use a "summary" instead of a "histogram".  See:
https://prometheus.io/docs/practices/histograms/
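
For instance, "what latency do 95% of requests complete within?" is answered
at query time with a histogram (the metric name is a placeholder):

   histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

whereas a summary exports the quantile directly, e.g.
http_request_duration_seconds{quantile="0.95"}.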

However if you genuinely want to "capture back end latency *without 
aggregating*", then you shouldn't use prometheus at all - what you want is 
a logging system which captures all the individual events (e.g. loki, 
elasticsearch etc)



[prometheus-users] Re: Alert once a day

2020-09-15 Thread Brian Candler
What labels does your test alert have?

The first rule which matches, wins(*).  So if your alert has both 
"frequency: daily" and "alertname: Watchdog" labels then it will match the 
first route, and inherit the default repeat_interval of 10m.

(*) Unless you set "continue: true", but then the result could be sent to 
multiple receivers at the same level.



Re: [prometheus-users] Re: Metric type for basic web analytics

2020-09-15 Thread Stuart Clark

On 15/09/2020 06:11, Nick wrote:
Keeping cardinality explosion in mind, what's a decent maximum number 
of exported metrics that can be considered performant for scraping and 
time-series processing?


As I mainly need the counter total, I can split the web analytics to 
reduce the number of possible label combinations, for example:


{ domain, page }
{ domain, browser }



For both of those some level of pre-processing would be advised to 
reduce cardinality.


For page, remove query strings, etc. (especially as they can contain 
unique tracking IDs). For the browser convert the user agent header into 
a general string such as Firefox, Chrome, Opera, etc.





Re: [prometheus-users] Prometheus disconnected data recovery

2020-09-15 Thread Stuart Clark

On 15/09/2020 04:16, tiecheng shen wrote:
Hello, I am a newbie to prometheus. I have a requirement. When the 
prometheus server and the scraped client are disconnected from the 
network, prometheus cannot collect data during the outage, and the 
graph shows a gap once the network is reconnected. If I store the data 
locally while disconnected, can I recover that data? If not, is there any 
third-party way to support this?


The recommendation is to site Prometheus servers within the same failure 
domains as the services you are scraping. Can a Prometheus server be 
installed next to this remote client?





Re: [prometheus-users] Re: Prometheus.service status failed

2020-09-15 Thread Suryaprakash Kancharlapalli
Thank you Brian, version 2.21 worked for me

On Mon, Sep 14, 2020, 8:47 PM Brian Candler  wrote:

> The article looks fine, it's just very old.  Replace 2.3.2 with latest
> version 2.21.0 from https://github.com/prometheus/prometheus/releases
>
> One of the comments says that using multiple targets under a single
> static_configs section is problematic.  That seems weird, but to rule it
> out you could try:
>
> scrape_configs:
>   - job_name: 'prometheus'
> scrape_interval: 5s
> static_configs:
>   - targets: ['localhost:9090','localhost:9100']
>
>



[prometheus-users] Re: 1st service down alert repeating when 2nd service down after few minutes

2020-09-15 Thread Sandeep Rao Kokkirala
Thanks, Brian. It's working.

On Monday, September 14, 2020 at 7:24:31 PM UTC+8 b.ca...@pobox.com wrote:

> On Monday, 14 September 2020 11:38:43 UTC+1, Sandeep Rao Kokkirala wrote:
>>
>> Consider that the 1st service is down: our alertmanager triggers the alert. 
>> When the 2nd service goes down after 10 minutes, it triggers both the 
>> 1st service and 2nd service alerts. The 1st service alert was already 
>> triggered, so we don't want the 1st service alert to repeat.
>>
>
> It's sending a new *grouped* alert which contains both the original alert 
> and the 2nd alert, grouped together.  You asked for this with   group_by: 
> ['alertname']   which means that two alerts with the same 'alertname' 
> label should be considered part of the same group.
>
> You can disable grouping entirely with:    group_by: ['...']    (yes, 
> that's literally three dots in there)
>
> https://prometheus.io/docs/alerting/latest/configuration/#route
>
>
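
A sketch of a route with grouping disabled (the receiver name is a
placeholder):

   route:
     receiver: default
     group_by: ['...']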
