[prometheus-users] Re: accumulating counter metric that are never incremented since restart

2023-05-03 Thread Johny
Thanks Brian. I was able to use .WithLabelValues(...).Add(0) at 
initialization to export the 0 value.
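
For reference, a minimal sketch of that approach (the metric name, label 
name and values, and port are made up for illustration; promauto registers 
the vector with the default registry):

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// promauto registers the vector with the default registry at package init.
var failCount = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "myservice_request_failures_total", // hypothetical metric name
		Help: "Number of failed requests.",
	},
	[]string{"module"},
)

func main() {
	// Touch each expected label combination once at startup so the
	// series is exported with value 0 even if it is never incremented.
	failCount.WithLabelValues("module1").Add(0)
	failCount.WithLabelValues("module2").Add(0)

	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}

With both series present from startup, a sum such as 
(fail_count1 + fail_count2) / (total_count1 + total_count2) no longer 
silently drops the counter that has not yet been incremented.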

On Wednesday, May 3, 2023 at 4:55:03 PM UTC-4 Brian Candler wrote:

> Confirmed. Take the "Basic Example" from here:
>
> https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#hdr-A_Basic_Example
>
> Remove the .Inc() from the "m.hdFailures.With" line
>
> $ curl -fsS localhost:8080/metrics | grep hd_errors
> # HELP hd_errors_total Number of hard-disk errors.
> # TYPE hd_errors_total counter
> hd_errors_total{device="/dev/sda"} 0
>
> On Wednesday, 3 May 2023 at 21:51:53 UTC+1 Brian Candler wrote:
>
>> Is it a CounterVec you're using?
>>
>> If so, I think that v.With(labels...) should be sufficient to initialise 
>> it.
>>
>> On Wednesday, 3 May 2023 at 20:52:57 UTC+1 Johny wrote:
>>
>>> Yes golang. I am using promauto to auto register my metrics with the 
>>> default registry at the time of initialization.
>>>
>>> https://pkg.go.dev/github.com/prometheus/client_golang@v1.15.1/prometheus/promauto
>>>
>>> Do you recall a way to force export it always with a default initial 
>>> value of 0? 
>>>
>>> On Wednesday, May 3, 2023 at 3:33:38 PM UTC-4 Brian Candler wrote:
>>>
 Yes, you should be able to create a counter which publishes its initial 
 value of zero. I'm fairly sure I've done this with the Golang client some 
 time in the past. What language and client library are you using?

 You'll have to initialise the counters explicitly. If the first time 
 the client library knows about the counter is when you increment it, then 
 clearly it won't be able to export it until then.

 On Wednesday, 3 May 2023 at 20:19:27 UTC+1 Johny wrote:

> I need to sum two separate counter metrics capturing request failures  
> to compute ratio of error requests for an alerting signal. The code 
> initializing and setting these counters sits in separate modules 
> preventing 
> reuse of one counter.
>
> The problem is when one of the counter is never incremented after a 
> restart, service never exports the data point, prometheus will never get 
> the time series and the summation will return nothing. 
>
> Is there a way to "force" publish a counter to 0 always on service 
> reboot during counter initialization to avoid this problem?
>
>  (fail_count1 + fail_count2) / (total_count1 + total_count2)
>
>



[prometheus-users] Re: accumulating counter metric that are never incremented since restart

2023-05-03 Thread Brian Candler
Confirmed. Take the "Basic Example" from here:
https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#hdr-A_Basic_Example

Remove the .Inc() from the "m.hdFailures.With" line

$ curl -fsS localhost:8080/metrics | grep hd_errors
# HELP hd_errors_total Number of hard-disk errors.
# TYPE hd_errors_total counter
hd_errors_total{device="/dev/sda"} 0
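
For concreteness, a cut-down sketch of that experiment (condensed from the 
linked Basic Example; the full example wraps the metrics in a struct, which 
is omitted here):

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	reg := prometheus.NewRegistry()

	hdFailures := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "hd_errors_total",
			Help: "Number of hard-disk errors.",
		},
		[]string{"device"},
	)
	reg.MustRegister(hdFailures)

	// The Basic Example calls .Inc() here; calling With() alone still
	// creates the child series, so it is exported with value 0.
	hdFailures.With(prometheus.Labels{"device": "/dev/sda"})

	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Scraping /metrics then shows the zero-valued counter as in the curl output 
above.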

On Wednesday, 3 May 2023 at 21:51:53 UTC+1 Brian Candler wrote:

> Is it a CounterVec you're using?
>
> If so, I think that v.With(labels...) should be sufficient to initialise 
> it.
>
> On Wednesday, 3 May 2023 at 20:52:57 UTC+1 Johny wrote:
>
>> Yes golang. I am using promauto to auto register my metrics with the 
>> default registry at the time of initialization.
>>
>> https://pkg.go.dev/github.com/prometheus/client_golang@v1.15.1/prometheus/promauto
>>
>> Do you recall a way to force export it always with a default initial 
>> value of 0? 
>>
>> On Wednesday, May 3, 2023 at 3:33:38 PM UTC-4 Brian Candler wrote:
>>
>>> Yes, you should be able to create a counter which publishes its initial 
>>> value of zero. I'm fairly sure I've done this with the Golang client some 
>>> time in the past. What language and client library are you using?
>>>
>>> You'll have to initialise the counters explicitly. If the first time the 
>>> client library knows about the counter is when you increment it, then 
>>> clearly it won't be able to export it until then.
>>>
>>> On Wednesday, 3 May 2023 at 20:19:27 UTC+1 Johny wrote:
>>>
 I need to sum two separate counter metrics capturing request failures  
 to compute ratio of error requests for an alerting signal. The code 
 initializing and setting these counters sits in separate modules 
 preventing 
 reuse of one counter.

 The problem is when one of the counter is never incremented after a 
 restart, service never exports the data point, prometheus will never get 
 the time series and the summation will return nothing. 

 Is there a way to "force" publish a counter to 0 always on service 
 reboot during counter initialization to avoid this problem?

  (fail_count1 + fail_count2) / (total_count1 + total_count2)





[prometheus-users] Re: accumulating counter metric that are never incremented since restart

2023-05-03 Thread Brian Candler
Is it a CounterVec you're using?

If so, I think that v.With(labels...) should be sufficient to initialise it.

On Wednesday, 3 May 2023 at 20:52:57 UTC+1 Johny wrote:

> Yes golang. I am using promauto to auto register my metrics with the 
> default registry at the time of initialization.
>
> https://pkg.go.dev/github.com/prometheus/client_golang@v1.15.1/prometheus/promauto
>
> Do you recall a way to force export it always with a default initial value 
> of 0? 
>
> On Wednesday, May 3, 2023 at 3:33:38 PM UTC-4 Brian Candler wrote:
>
>> Yes, you should be able to create a counter which publishes its initial 
>> value of zero. I'm fairly sure I've done this with the Golang client some 
>> time in the past. What language and client library are you using?
>>
>> You'll have to initialise the counters explicitly. If the first time the 
>> client library knows about the counter is when you increment it, then 
>> clearly it won't be able to export it until then.
>>
>> On Wednesday, 3 May 2023 at 20:19:27 UTC+1 Johny wrote:
>>
>>> I need to sum two separate counter metrics capturing request failures  
>>> to compute ratio of error requests for an alerting signal. The code 
>>> initializing and setting these counters sits in separate modules preventing 
>>> reuse of one counter.
>>>
>>> The problem is when one of the counter is never incremented after a 
>>> restart, service never exports the data point, prometheus will never get 
>>> the time series and the summation will return nothing. 
>>>
>>> Is there a way to "force" publish a counter to 0 always on service 
>>> reboot during counter initialization to avoid this problem?
>>>
>>>  (fail_count1 + fail_count2) / (total_count1 + total_count2)
>>>
>>>



[prometheus-users] Re: accumulating counter metric that are never incremented since restart

2023-05-03 Thread Johny
Yes, Golang. I am using promauto to auto-register my metrics with the 
default registry at initialization time.
https://pkg.go.dev/github.com/prometheus/client_golang@v1.15.1/prometheus/promauto

Do you recall a way to always force-export it with a default initial value 
of 0?

On Wednesday, May 3, 2023 at 3:33:38 PM UTC-4 Brian Candler wrote:

> Yes, you should be able to create a counter which publishes its initial 
> value of zero. I'm fairly sure I've done this with the Golang client some 
> time in the past. What language and client library are you using?
>
> You'll have to initialise the counters explicitly. If the first time the 
> client library knows about the counter is when you increment it, then 
> clearly it won't be able to export it until then.
>
> On Wednesday, 3 May 2023 at 20:19:27 UTC+1 Johny wrote:
>
>> I need to sum two separate counter metrics capturing request failures  to 
>> compute ratio of error requests for an alerting signal. The code 
>> initializing and setting these counters sits in separate modules preventing 
>> reuse of one counter.
>>
>> The problem is when one of the counter is never incremented after a 
>> restart, service never exports the data point, prometheus will never get 
>> the time series and the summation will return nothing. 
>>
>> Is there a way to "force" publish a counter to 0 always on service reboot 
>> during counter initialization to avoid this problem?
>>
>>  (fail_count1 + fail_count2) / (total_count1 + total_count2)
>>
>>



[prometheus-users] Re: accumulating counter metric that are never incremented since restart

2023-05-03 Thread Brian Candler
Yes, you should be able to create a counter which publishes its initial 
value of zero. I'm fairly sure I've done this with the Golang client some 
time in the past. What language and client library are you using?

You'll have to initialise the counters explicitly. If the first time the 
client library knows about the counter is when you increment it, then 
clearly it won't be able to export it until then.

On Wednesday, 3 May 2023 at 20:19:27 UTC+1 Johny wrote:

> I need to sum two separate counter metrics capturing request failures  to 
> compute ratio of error requests for an alerting signal. The code 
> initializing and setting these counters sits in separate modules preventing 
> reuse of one counter.
>
> The problem is when one of the counter is never incremented after a 
> restart, service never exports the data point, prometheus will never get 
> the time series and the summation will return nothing. 
>
> Is there a way to "force" publish a counter to 0 always on service reboot 
> during counter initialization to avoid this problem?
>
>  (fail_count1 + fail_count2) / (total_count1 + total_count2)
>
>



[prometheus-users] accumulating counter metric that are never incremented since restart

2023-05-03 Thread Johny
I need to sum two separate counter metrics that capture request failures to 
compute the ratio of failed requests for an alerting signal. The code that 
initializes and increments these counters sits in separate modules, which 
prevents reusing a single counter.

The problem is that when one of the counters is never incremented after a 
restart, the service never exports that data point, Prometheus never gets 
the time series, and the summation returns nothing.

Is there a way to "force" publish a counter with value 0 at counter 
initialization on every service restart, to avoid this problem?

 (fail_count1 + fail_count2) / (total_count1 + total_count2)



[prometheus-users] Re: Unable to run multiple queries in single one

2023-05-03 Thread Brian Candler
Sorry, ignore that. You want one if it's present, *or* the other.

On Wednesday, 3 May 2023 at 18:16:01 UTC+1 Brian Candler wrote:

> Another approach is to turn it around using the "unless" operator. This 
> will *only* give timeseries from the vector on the LHS, and will suppress 
> any which have matching label sets on the RHS.
>
> sum by (nodename) () unless on (nodename) machine_cpu_cores
>
> [assuming that machine_cpu_cores has a "nodename" label]
>
> On Wednesday, 3 May 2023 at 17:18:37 UTC+1 Brian Candler wrote:
>
>> I think you need to describe:
>> * what you actually see
>> * what you would like to see instead
>> * the results of each of the subexpressions (i.e. left and right of "or") 
>> in the PromQL browser
>>
>> Then it should be clearer how to combine them to achieve the result you 
>> want.
>>
>> For more info on how the "or" operator works, see: 
>> https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators
>>
>> The most important thing to note is that it's not a boolean: it's a 
>> union.  An expression like "a or b" has a vector of values for "a" and a 
>> vector of values for "b". The result combines both timeseries a and b into 
>> the result set.  However, if there are any metrics from "a" and "b" which 
>> match *exactly* the same set of label values, then only the "a" one will be 
>> included in the result set.
>>
>> You can restrict the set of labels used for matching using 
>> "on(labels...)" or "ignoring(labels...)"
>>
>> In your example: the subexpressions "sum by (nodename) (...)" will only 
>> have a single label {nodename="XXX"}, whilst the subexpression 
>> "machine_cpu_cores" very likely has more labels than that (job, instance 
>> etc) and may not have a "nodename" label at all.  Since the labels of the 
>> LHS and RHS of "or" don't match, both sides are included in the result.
>>
>> On Wednesday, 3 May 2023 at 15:19:09 UTC+1 Anuj Kumar wrote:
>>
>>> HI All,
>>>
>>> I am using the below query but getting output for the two queries . I 
>>> need to get the output for each query. Any help would be appreciated.
>>>
>>> machine_cpu_cores or sum by(nodename) 
>>> (irate(node_cpu_seconds_total{mode="idle"}[5m]) * on(instance) 
>>> group_left(nodename) node_uname_info) or sum by(nodename) 
>>> (irate(node_cpu_seconds_total{mode!="idle"}[5m]) * on(instance) 
>>> group_left(nodename) node_uname_info)
>>>
>>> Thanks,
>>> Anuj Kumar
>>>
>>



[prometheus-users] Re: Unable to run multiple queries in single one

2023-05-03 Thread Brian Candler
Another approach is to turn it around using the "unless" operator. This 
will *only* give timeseries from the vector on the LHS, and will suppress 
any which have matching label sets on the RHS.

sum by (nodename) () unless on (nodename) machine_cpu_cores

[assuming that machine_cpu_cores has a "nodename" label]

On Wednesday, 3 May 2023 at 17:18:37 UTC+1 Brian Candler wrote:

> I think you need to describe:
> * what you actually see
> * what you would like to see instead
> * the results of each of the subexpressions (i.e. left and right of "or") 
> in the PromQL browser
>
> Then it should be clearer how to combine them to achieve the result you 
> want.
>
> For more info on how the "or" operator works, see: 
> https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators
>
> The most important thing to note is that it's not a boolean: it's a 
> union.  An expression like "a or b" has a vector of values for "a" and a 
> vector of values for "b". The result combines both timeseries a and b into 
> the result set.  However, if there are any metrics from "a" and "b" which 
> match *exactly* the same set of label values, then only the "a" one will be 
> included in the result set.
>
> You can restrict the set of labels used for matching using "on(labels...)" 
> or "ignoring(labels...)"
>
> In your example: the subexpressions "sum by (nodename) (...)" will only 
> have a single label {nodename="XXX"}, whilst the subexpression 
> "machine_cpu_cores" very likely has more labels than that (job, instance 
> etc) and may not have a "nodename" label at all.  Since the labels of the 
> LHS and RHS of "or" don't match, both sides are included in the result.
>
> On Wednesday, 3 May 2023 at 15:19:09 UTC+1 Anuj Kumar wrote:
>
>> HI All,
>>
>> I am using the below query but getting output for the two queries . I 
>> need to get the output for each query. Any help would be appreciated.
>>
>> machine_cpu_cores or sum by(nodename) 
>> (irate(node_cpu_seconds_total{mode="idle"}[5m]) * on(instance) 
>> group_left(nodename) node_uname_info) or sum by(nodename) 
>> (irate(node_cpu_seconds_total{mode!="idle"}[5m]) * on(instance) 
>> group_left(nodename) node_uname_info)
>>
>> Thanks,
>> Anuj Kumar
>>
>



[prometheus-users] Re: Unable to run multiple queries in single one

2023-05-03 Thread Brian Candler
I think you need to describe:
* what you actually see
* what you would like to see instead
* the results of each of the subexpressions (i.e. left and right of "or") 
in the PromQL browser

Then it should be clearer how to combine them to achieve the result you 
want.

For more info on how the "or" operator works, see: 
https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators

The most important thing to note is that it's not a boolean: it's a union.  
An expression like "a or b" has a vector of values for "a" and a vector of 
values for "b". The result combines both timeseries a and b into the result 
set.  However, if there are any metrics from "a" and "b" which match 
*exactly* the same set of label values, then only the "a" one will be 
included in the result set.

You can restrict the set of labels used for matching using "on(labels...)" 
or "ignoring(labels...)"

In your example: the subexpressions "sum by (nodename) (...)" will only 
have a single label {nodename="XXX"}, whilst the subexpression 
"machine_cpu_cores" very likely has more labels than that (job, instance 
etc) and may not have a "nodename" label at all.  Since the labels of the 
LHS and RHS of "or" don't match, both sides are included in the result.

On Wednesday, 3 May 2023 at 15:19:09 UTC+1 Anuj Kumar wrote:

> HI All,
>
> I am using the below query but getting output for the two queries . I need 
> to get the output for each query. Any help would be appreciated.
>
> machine_cpu_cores or sum by(nodename) 
> (irate(node_cpu_seconds_total{mode="idle"}[5m]) * on(instance) 
> group_left(nodename) node_uname_info) or sum by(nodename) 
> (irate(node_cpu_seconds_total{mode!="idle"}[5m]) * on(instance) 
> group_left(nodename) node_uname_info)
>
> Thanks,
> Anuj Kumar
>



[prometheus-users] Unable to run multiple queries in single one

2023-05-03 Thread Anuj Kumar
Hi all,

I am using the query below, but I am getting the output of the two queries 
combined. I need to get the output for each query separately. Any help 
would be appreciated.

machine_cpu_cores or sum by(nodename) 
(irate(node_cpu_seconds_total{mode="idle"}[5m]) * on(instance) 
group_left(nodename) node_uname_info) or sum by(nodename) 
(irate(node_cpu_seconds_total{mode!="idle"}[5m]) * on(instance) 
group_left(nodename) node_uname_info)

Thanks,
Anuj Kumar



[prometheus-users] Re: Hex-STRING instead of DisplayString = trailing NULL

2023-05-03 Thread Daniel Swarbrick
You can override the type of an OID in your generator.yml to force 
snmp_exporter to always handle it as e.g. DisplayString, even when dodgy 
SNMP engines return it as a different type than the MIB specifies. See the 
"overrides" section of 
https://github.com/prometheus/snmp_exporter/tree/main/generator#file-format

However, that won't prevent the null bytes from still appearing in the 
label value, and the only way (currently) to do that is with metric 
relabelling as you have already discovered. You might want to subscribe to 
this github issue: https://github.com/prometheus/snmp_exporter/issues/615

On Sunday, April 30, 2023 at 11:25:25 AM UTC+2 Jonathan Tougas wrote:

> I'm looking for a way to deal with a situation where we end up with null 
> characters trailing some label values: `count({ifDescr=~".*\x00"}) != 0`.
>
> The source of the problem seems to be with `ifDescr` returned as a 
> `Hex-String` instead of what the MIB says should be a `DisplayString`... 
> for __some__ servers.
>
> # Good,  99% of servers:
> $ snmpget -v 2c -c $creds 172.21.34.10 1.3.6.1.2.1.2.2.1.2.1
> iso.3.6.1.2.1.2.2.1.2.1 = STRING: "eth0"
>
> # Bad, Cisco CVP tsk tsk tsk...
> $ snmpget -v 2c -c $creds 172.20.220.88 1.3.6.1.2.1.2.2.1.2.1
> iso.3.6.1.2.1.2.2.1.2.1 = Hex-STRING: 53 6F 66 74 77 61 72 65 20 4C 6F 6F 
> 70 62 61 63
> 6B 20 49 6E 74 65 72 66 61 63 65 20 31 00
>
> I'm currently planning on using `metric_relabel_configs` to clean up the 
> trailing nulls in these and other similar situations I uncovered. 
> Is there a better way than mopping up like that? Perhaps snmp_exporter can 
> deal with these and convert somehow? I'm not familiar enough with it to 
> figure out whether it can.
>



[prometheus-users] Re: HA Prometheus instances use different amount of storage

2023-05-03 Thread Daniel Swarbrick
This sounds like you might have run into the Go timer jitter bug. Try 
enabling timestamp tolerance to mitigate the effect of timer jitter: 
https://promlabs.com/blog/2021/09/14/whats-new-in-prometheus-2-30/#improving-storage-efficiency-by-tuning-timestamp-tolerances

I have so far not found a satisfactory resolution to this bug, and even 
though enabling timestamp jitter tolerance helps, I still occasionally see 
instances inexplicably using up to 2.5x their previous average bytes per 
sample after a restart. It seems to be a lucky dip. Restarting the instance 
usually brings it back down to the original and expected bytes per sample.

On Wednesday, April 12, 2023 at 9:55:10 AM UTC+2 Per Carlson wrote:

> Hi.
>
> We have a pair of Prometheus instances that consume significantly 
> different amounts of storage. The instances are created by the same 
> StatefulSet (created by Prometheus-operator), so they are using the same 
> configuration.
>
> Both instances have similar number of samples and series, but instance 
> "0" consume up to ~50% more storage than instance "1".
>
> $ kubectl exec prometheus-prometheus-0 -- /bin/sh -c "promtool tsdb list ."
> BLOCK ULID  MIN TIME   MAX TIME   DURATION   
> NUM SAMPLES  NUM CHUNKS   NUM SERIES   SIZE
> 01GWY0R4N3QG1QJS957XZ0SYP7  168026400  168032880  18h0m0s   
>  3296299059   26900037 931315   7259935610
> 01GX05E79R6WNQ4F6MMB068WJ7  168032883  168039360  17h59m59.997s 
>  3312300492   27012299 892364   7265602468
> 01GX237SWZYZ6X5XXMENGMQ1YM  168039362  168045840  17h59m59.998s 
>  3315540127   27036907 894595   7247593445
> 01GX410BDAPBMPKP100300C7DW  168045841  168052320  17h59m59.999s 
>  3320458065   27130364 987454   7328750825
> 01GX5YTZVD5W97D497JA11CATT  168052327  168058800  17h59m59.993s 
>  3318443269   27135815 1007206  7380926789
> 01GX7WMF1FNJ6MGT0TJY2A5KEM  168058801  168065280  17h59m59.999s 
>  3331999517   27259726 1028363  7364976990
> 01GX9TDYMKVWJ9WYYY7CD8BCWH  168065285  168071760  17h59m59.995s 
>  3327868238   27186293 981912   7288127305
> 01GXBR7FYPSRWA6N6313MMR9BM  168071769  168078240  17h59m59.991s 
>  3327937718   27125975 896286   7199443835
> 01GXDP01QKKXC137B7JJZ6706W  168078241  168084720  17h59m59.999s 
>  037262   27172805 897459   7194002011
> 01GXFKTGVZN6RXRM74PXB5E61Q  168084721  168091200  17h59m59.999s 
>  3329211104   27134065 879001   7202044230
> 01GXHHM1JST118SYQNX8Z5W8PX  168091204  168097680  17h59m59.996s 
>  3329464442   27131788 876881   7192136400
> 01GXKFCM51YQFDAWTGP6BBXGQZ  168097683  168104160  17h59m59.997s 
>  3329134675   27127804 875877   7197030123
> 01GXND71ZF62MK8M1DP5E7345M  1681041600011  168110640  17h59m59.989s 
>  3327555787   27119184 887763   7216837469
> 01GXQB0QJX2T55FBFFSZC9PC4D  168110645  168117120  17h59m59.995s 
>  3324035858   27084455 871653   7195109123
> 01GXS8T0EJXHH2C1B0CBAEPHQB  1681171200011  168123600  17h59m59.989s 
>  3315573555   26493111 989655   6235040678
> 01GXSXCRNTNDJC359R7160ZEDX  168123601  168125760  5h59m59.999s   
> 1107306526   9028997  828578   1830084344
> 01GXSPFKRCVXHRSD2WAFEFM0WD  168125765  168126480  1h59m59.995s   
> 3697068393015597  808854   671597409
> 01GXSXBRED7JT7WJY9318QYKKZ  168126482  168127200  1h59m59.998s   
> 3696613863012000  805553   668951473
> 01GXT47FQYC712E0M6XPSP41FF  168127201  168127920  1h59m59.999s   
> 3697406283021714  823966   673649781
>
> $ kubectl exec prometheus-prometheus-1 -- /bin/sh -c "promtool tsdb list ."
> BLOCK ULID  MIN TIME   MAX TIME   DURATION   
> NUM SAMPLES  NUM CHUNKS   NUM SERIES   SIZE
> 01GWY0RDK93D2RYJBHJRDMS100  168026400  168032880  18h0m0s   
>  3296396516   26926127 957040   4831014683
> 01GX05ETVBDXMQH0KW9NX7RCPC  168032883  168039360  17h59m59.997s 
>  3312324642   27036260 917296   4807892522
> 01GX2383F7YPDX400MN4DQ9CSX  168039362  168045840  17h59m59.998s 
>  3315587751   27059963 918166   4832761551
> 01GX410PJXX52PKFVH1H205385  168045843  168052320  17h59m59.997s 
>  3320397897   27157090 1014022  4890962085
> 01GX5YVKEAQ6D1NZM1AQW0YJ90  168052323  168058800  17h59m59.997s 
>  3318472581   27171422 1042831  4854062752
> 01GX7WMWV41M3PW3BFV62P0M32  168058801  168065280  17h59m59.999s 
>  3331918609   27288755 1056267  4861196239
> 01GX9TECS126QJM1A1F61GW0ZT  168065283  168071760  17h59m59.997s 
>  3328065112   27214643 1008335  4831609465
> 01GXBR7NZ3RXVSP50V5J2QE4HQ  168071763  168078240  17h59m59.997s 
>  3327954927   27159515 929150   4800273178
> 01GXDP1BCZF7THHQTSK4YGAYX7  168078241  

[prometheus-users] Including volume with error percentage in alert description

2023-05-03 Thread Mike W
Hi all, I'm having trouble with an alert description that reports the 
failure % of our SMS service, Twilio. When the alert fires, the description 
currently looks like this:

"25.00% of Twilio SMS attempts have failed in the last 5 minutes"

However, we also want to include the volume of attempts that came in during 
the 5-minute period, so we want it to look like this:

"25.00% of Twilio SMS attempts have failed in the last 5 minutes
Total SMS attempts: 8"

I believe this is possible to configure in alertmanager, however I have 
been having lots of trouble with it, so I am hoping that someone here can 
help. It seems that another person on this forum is attempting something 
similar, and I have tried similar descriptions, but both produce errors.


   - description: '{{ $value | printf "%.2f" }}% of Twilio SMS attempts have 
     failed in the last 5 minutes from API \n Total SMS attempts: 
     {{ printf "(sum(rate(twilio_text_requests_seconds_count{application=\"api\"}[5m]))) * 300" | query | humanize }}'

   - description: '{{ $value | printf "%.2f" }}% of Twilio SMS attempts have 
     failed in the last 5 minutes from API \n Total SMS attempts: 
     {{ with query "(sum(rate(twilio_text_requests_seconds_count{application='api'}[5m]))) * 300" }} {{ . | first | value | humanize }} {{ end }}'

Errors for the above attempts:

   - error calling humanize: can't convert template.queryResult to float

If anyone can point me in the right direction for query template functions 
or post a solution it would be very much appreciated!



[prometheus-users] Aggregations Wrapper

2023-05-03 Thread Gil P
Hi everyone, I am new to Prometheus. I need to obtain both the max and the 
average of a query result. I am wondering about the penalty of sending the 
same query twice (is there a caching mechanism similar to Elasticsearch's?). 
I was also wondering about a wrapper that applies several aggregations to a 
series at the same time; however, since that does not seem to exist yet, I 
was wondering whether it would be bad design, as it seems far less clean 
than most queries I have seen so far. I'd appreciate any insights!

Thanks!
