[prometheus-users] Re: {{ $labels.instance } not picking the instance detials

[email protected] Sun, 12 Dec 2021 13:21:28 -0800

The issue of stripping away labels in queries but then relying on them in 
alert annotations comes up a lot in the teams I work with. It's just so 
easy to overlook it.
That's why I've added a check to pint that will try to warn users when that 
happens, see 
https://github.com/cloudflare/pint/blob/main/docs/CONFIGURATION.md#template.
It won't work in every case, but when I ran it on our internal repo it 
found tons of cases, mostly Grafana links that rely on labels to populate 
dashboard variables, so it should be useful.
On Sunday, 12 December 2021 at 11:10:51 UTC Brian Candler wrote:


> Paste the expression in the PromQL browser in the prometheus web 
> interface.  This will show you the results of the expression, including all 
> the labels (switch to graph view to see historical results).  If the result 
> of the PromQL expression doesn't have an instance label, then that won't be 
> available to the alert.
>
> A brief look at your expression suggests that you've intentionally got rid 
> of all the labels.
>
> * sum(foo) gives the total value across all timeseries with metric name 
> "foo".  The result is a single value with no labels (because the result 
> summarises *all* the timeseries given)
> * vector(0) has no labels
>
> If you want instance labels in the result then you're going to have to 
> rewrite your expression.  As a starting point,
>
> sum(foo) by (instance)
>
> will give you a vector of results, each of which has a different instance 
> label. 
>
> I'm not 100% sure what you're trying to do with the "or vector(0)" stuff, 
> but maybe you want something like this:
>
>    expr: increase(blah[3m]) > 0
>
> with the proviso that the resulting value may not be an exact integer - 
> it's the calculated per-second rate, scaled to the time period.  Note that 
> the rate window has to include both the first and last data points of the 
> time period you wish to calculate across: so if you're sampling every 1 
> minute, and you want to calculate the rate using two data points which are 
> 3 minutes apart, then you need blah[4m].  However the result will also be 
> scaled to tell you the estimated increase over 4 minutes, even though it's 
> only using 3 minutes' worth of data.  I'm afraid this is an ugly corner of 
> prometheus; more discussion at #3806 
> <https://github.com/prometheus/prometheus/issues/3806>
>
> However, even increase(blah[3m]) will not work with a single data point, 
> or it won't alarm if the first value in the timeseries is non-zero, because 
> it doesn't know for sure that the counter was previously zero.
>
> Maybe this is closer to what you want:
>
>    expr: (blah > 0) unless (blah == blah offset 3m)
>
> However that will give you the value of the counter, not the value of the 
> increase.
>
> HTH, Brian.
>
