[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-26 Thread marc koser
To close the loop on this, I was able to get this working using this query:

max by (group) (redis_cluster_known_nodes) != count by (group) (up{service=~"exporter-redis-.*"})

Thanks for your insight on this Brian.
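
For readers of the archive, here is a minimal Python sketch of what that query evaluates to on the sample values posted earlier in the thread. It is not PromQL and not how Prometheus works internally; the helper name `aggregate_by` and the list-of-tuples representation are assumptions made purely for illustration.

```python
from collections import defaultdict

SERVICES = ("exporter-redis-6379", "exporter-redis-6380")
INSTANCES = ("node-1", "node-2", "node-3", "node-4", "node-5")

# redis_cluster_known_nodes values as posted in the thread.
KNOWN = {
    ("node-1", SERVICES[0]): 10, ("node-1", SERVICES[1]): 10,
    ("node-2", SERVICES[0]): 11, ("node-2", SERVICES[1]): 16,
    ("node-3", SERVICES[0]): 16, ("node-3", SERVICES[1]): 16,
    ("node-4", SERVICES[0]): 16, ("node-4", SERVICES[1]): 16,
    ("node-5", SERVICES[0]): 16, ("node-5", SERVICES[1]): 16,
}

# Each sample is (labels, value); every `up` sample is 1 in the thread's data.
known_nodes = [({"group": "group-a", "instance": i, "service": s}, v)
               for (i, s), v in KNOWN.items()]
up = [({"group": "group-a", "instance": i, "service": s}, 1)
      for i in INSTANCES for s in SERVICES]


def aggregate_by(samples, label, func):
    """Hypothetical stand-in for `func by (label) (...)`: bucket values by one label."""
    buckets = defaultdict(list)
    for labels, value in samples:
        buckets[labels[label]].append(value)
    return {key: func(values) for key, values in buckets.items()}


lhs = aggregate_by(known_nodes, "group", max)  # max by (group) (redis_cluster_known_nodes)
rhs = aggregate_by(up, "group", len)           # count by (group) (up{...})

# `!=` acts as a filter: keep the left-hand value only where the two sides differ.
alerts = {g: v for g, v in lhs.items() if g in rhs and v != rhs[g]}
print(alerts)
```

With the sample data the left side collapses to 16 and the right side to 10 for group-a, so the group is returned once rather than once per instance. If every instance reported 10, the result would be empty.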


-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this

[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-19 Thread Brian Candler
Or even:
redis_cluster_known_nodes != redis_cluster_known_nodes offset 5m
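
A rough Python sketch of how the `offset 5m` comparison behaves over time for a single series. The timestamps and values are invented, the helper names are hypothetical, and Prometheus staleness and lookback handling are ignored.

```python
# One sample per minute for 20 minutes; the value changes from 10 to 11 at t=600s.
samples = [(ts, 10 if ts < 600 else 11) for ts in range(0, 1260, 60)]


def value_at(series, t):
    """Latest sample value at or before time t, or None if there is none yet."""
    earlier = [v for ts, v in series if ts <= t]
    return earlier[-1] if earlier else None


def changed_vs_offset(series, t, offset=300):
    """Mimic `x != x offset 5m`: true when the value now differs from 5 minutes ago."""
    now, before = value_at(series, t), value_at(series, t - offset)
    return now is not None and before is not None and now != before


firing = [t for t, _ in samples if changed_vs_offset(samples, t)]
print(firing)
```

In this sketch the comparison is true only during the five minutes after the change. Once both "now" and "5 minutes ago" see the new value, the expression returns nothing again, even though the extra node is still known.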


To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/0280feae-1f5f-427f-9000-6803626be449n%40googlegroups.com.


[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-18 Thread marc koser
Perhaps an easier option would be to compare redis_cluster_known_nodes
against what it was some time interval ago:
redis_cluster_known_nodes != avg_over_time(redis_cluster_known_nodes[1d:4h])

It's less than ideal, since it doesn't use a static, expected value for the
total number of cluster nodes, and it would also match when the node count
returns to the expected value, but I can deal with that for now.
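
A small Python sketch of the trade-off described above, for one series. The six readings per list stand in for the 4-hourly points of a `[1d:4h]` subquery; the values and the function name are invented, and the way Prometheus aligns subquery steps is not modelled.

```python
from statistics import mean


def differs_from_average(current, history):
    """Mimic `x != avg_over_time(x[1d:4h])` for a single series."""
    return current != mean(history)


# Six invented 4-hourly readings covering roughly one day.
steady = [10, 10, 10, 10, 10, 10]
node_added = [10, 10, 10, 10, 16, 16]
back_to_normal = [16, 16, 16, 16, 10, 10]

print(differs_from_average(10, steady))
print(differs_from_average(16, node_added))
print(differs_from_average(10, back_to_normal))
```

The first case stays quiet, the second matches because the count has drifted away from its daily average, and the third also matches even though the count has returned to the expected 10, because the average still includes the old readings.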

Thanks for your help! 


To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/5ae502d0-ed8d-4767-b62f-e90702e643b0n%40googlegroups.com.


[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-18 Thread marc koser
> So really it boils down to, what's a "node" and how do you count them?
> Is a single "node" a whole cluster, or is a cluster a collection of nodes?

A node is a redis service that is part of a cluster (identified by the
`group` label), so a cluster is a collection of nodes. The total number of
nodes is known and, under normal circumstances, static. However, a redis
'node' is never forgotten unless the cluster is told to forget it, and I
want to alert on that case since it can skew the interpolation of other
metrics.

> In particular, what do these metrics mean?
>
> redis_cluster_known_nodes{group="group-a", instance="node-1", job="redis-cluster", service="exporter-redis-6379", team="sre"} 10
> redis_cluster_known_nodes{group="group-a", instance="node-2", job="redis-cluster", service="exporter-redis-6379"} 11
> redis_cluster_known_nodes{group="group-a", instance="node-3", job="redis-cluster", service="exporter-redis-6379"} 16
> redis_cluster_known_nodes{group="group-a", instance="node-4", job="redis-cluster", service="exporter-redis-6379"} 16
> redis_cluster_known_nodes{group="group-a", instance="node-5", job="redis-cluster", service="exporter-redis-6379"} 16

Each value is the number of redis nodes known to belong to a single
cluster, as seen from one running node.

> They are all the same "service", but how come instance "node-1" contains
> or sees 10 "nodes", but instance "node-2" contains or sees 11 "nodes", and
> the other instances contain or see 16 "nodes"?  Perhaps this inconsistency
> is the error you're trying to detect - in which case, what do you think is
> the correct number of nodes?

This is indeed the scenario I'm attempting to query for. When a node that
has joined the cluster becomes unreachable for any reason (e.g. redis is
uninstalled and re-installed, and the node rejoins the cluster), the node's
ID changes: the new ID is valid and reachable, while the old ID is no longer
valid and is unreachable.

The correct value is 10: 5 `instance`s x 2 `service`s.

> Let's say 16 is the correct answer for group="group-a" and
> service="exporter-redis-6379".  Perhaps you didn't show the full set of
> "up" metrics.  In which case, I'd first try to build an "up" query which
> gives the expected answer 16 on the right-hand side.  Maybe something like
> this:
>
> count by (service, group) (up{service=~"exporter-redis-.*"})
>
> What does that expression show?

{group="group-a", service="exporter-redis-6379"} 5
{group="group-a", service="exporter-redis-6380"} 5

> When you have that part working, then we can work on matching the LHS.
> Since each *instance* seems to have its own distinct idea of the total
> number of nodes, then I expect this requires an N:1 match on
> (group,service).  That is, there is 1 "should be" value for a given
> (service,group) on the RHS, and multiple nodes each with their own count of
> (service,group) on the LHS.

That sounds accurate.

> If that's the case, it might end up something like this:
>
> redis_cluster_known_nodes != on (service, group) group left() count by (service, group) (up{service=~"exporter-redis-.*"})
>
> but at this point I'm just speculating.

This gives the same result as before.

I'll keep plugging away at this to see what I can come up with.


To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/c4e11aaa-30cd-42f5-a902-f866f977d1f9n%40googlegroups.com.


[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-18 Thread Brian Candler
Sorry, I missed an underscore there.

   redis_cluster_known_nodes != on (service, group) group_left() count by (service, group) (up{service=~"exporter-redis-.*"})
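
A minimal Python sketch of the many-to-one match that `group_left()` enables here, using the exporter-redis-6379 values quoted in the thread. The dict layout and variable names are assumptions for illustration only, and label handling is simplified to the three labels that matter.

```python
from collections import Counter

# (instance, service, group) -> redis_cluster_known_nodes, values from the thread.
known_nodes = {
    ("node-1", "exporter-redis-6379", "group-a"): 10,
    ("node-2", "exporter-redis-6379", "group-a"): 11,
    ("node-3", "exporter-redis-6379", "group-a"): 16,
    ("node-4", "exporter-redis-6379", "group-a"): 16,
    ("node-5", "exporter-redis-6379", "group-a"): 16,
}
# Assumption: one scrape target (one `up` series) per known_nodes series.
up_targets = list(known_nodes)

# count by (service, group) (up{...}): one value per (service, group) pair.
expected = Counter((service, group) for _, service, group in up_targets)

# != on (service, group) group_left(): many left-hand series share one right-hand value.
result = {
    key: value
    for key, value in known_nodes.items()
    if (key[1], key[2]) in expected and value != expected[(key[1], key[2])]
}
print(expected)
print(len(result))
```

The match itself works, but with these numbers the right-hand side is 5 per service, so every left-hand series (10, 11 and 16) still differs from it and all five are returned.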

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/d17db356-0c1a-4693-99d3-c08fb8579342n%40googlegroups.com.


[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-18 Thread Brian Candler
If you run the two halves of the query separately:

redis_cluster_known_nodes

and

count by (instance, service, group) (up{service=~"exporter-redis-.*"})

then I think the reason will become clear.

If that set of "up" metrics is complete, then I'd expect the "count by" 
results for node-1 to be

{group="group-a",instance="node-1",service="exporter-redis-6379"} 1
{group="group-a",instance="node-1",service="exporter-redis-6380"} 1

and these values (of 1) are clearly different to

redis_cluster_known_nodes{group="group-a", instance="node-1", 
job="redis-cluster", service="exporter-redis-6379", team="sre"} 10
redis_cluster_known_nodes{group="group-a", instance="node-1", 
job="redis-cluster", service="exporter-redis-6380", team="sre"} 10

Aside: the "count by" seems superfluous here, since every "up" metric has a 
distinct combination of (instance,service,group).  I guess it ensures that 
up values of 0 are turned into 1.

Without knowing more about what you're trying to do and what these metrics 
represent, I can't really help.  A value of redis_cluster_known_nodes of 10 
suggests there are 10 "nodes" of some sort, whatever they are.  But the 
"up" metric will only be 1 or 0 (success or fail on scrape).  If you had a 
separate scrape target for each node then you could count or sum these to 
get the number of nodes, but the list of "up" metrics you showed suggests 
there's only one scrape job for each instance+service combination.
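
To make the count-versus-sum point concrete, here is a small Python sketch. The three `up` samples are invented (one of them is deliberately 0), and the dict layout is an assumption for illustration, not anything Prometheus exposes.

```python
from collections import Counter

# (instance, service, group) -> up value: 1 = scrape succeeded, 0 = scrape failed.
up = {
    ("node-1", "exporter-redis-6379", "group-a"): 1,
    ("node-1", "exporter-redis-6380", "group-a"): 0,
    ("node-2", "exporter-redis-6379", "group-a"): 1,
}

# count by (instance, service, group): number of series per label set; values are ignored,
# so a failed scrape (0) still counts as one series.
count_by_all = Counter(up.keys())

# count by (group): every series in the group falls into one bucket.
count_by_group = Counter(group for _, _, group in up)

# sum by (group): adds the values, so failed scrapes do not contribute.
sum_by_group = Counter()
for (_, _, group), value in up.items():
    sum_by_group[group] += value

print(set(count_by_all.values()), count_by_group["group-a"], sum_by_group["group-a"])
```

Grouping by every distinguishing label always yields 1 per series, which is why comparing it with a known-nodes value of 10 or 16 can never be equal. Dropping `instance` from the grouping is what turns the count into a number of targets.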

So really it boils down to, what's a "node" and how do you count them?  Is 
a single "node" a whole cluster, or is a cluster a collection of nodes?

In particular, what do these metrics mean?

redis_cluster_known_nodes{group="group-a", instance="node-1", 
job="redis-cluster", service="exporter-redis-6379", team="sre"} 10
redis_cluster_known_nodes{group="group-a", instance="node-2", 
job="redis-cluster", service="exporter-redis-6379"} 11
redis_cluster_known_nodes{group="group-a", instance="node-3", 
job="redis-cluster", service="exporter-redis-6379"} 16
redis_cluster_known_nodes{group="group-a", instance="node-4", 
job="redis-cluster", service="exporter-redis-6379"} 16
redis_cluster_known_nodes{group="group-a", instance="node-5", 
job="redis-cluster", service="exporter-redis-6379"} 16

They are all the same "service", but how come instance "node-1" contains or 
sees 10 "nodes", but instance "node-2" contains or sees 11 "nodes", and the 
other instances contain or see 16 "nodes"?  Perhaps this inconsistency is 
the error you're trying to detect - in which case, what do you think is the 
correct number of nodes?

Let's say 16 is the correct answer for group="group-a" and 
service="exporter-redis-6379".  Perhaps you didn't show the full set of 
"up" metrics.  In which case, I'd first try to build an "up" query which 
gives the expected answer 16 on the right-hand side.  Maybe something like 
this:

count by (service, group) (up{service=~"exporter-redis-.*"})

What does that expression show?

When you have that part working, then we can work on matching the LHS.  
Since each *instance* seems to have its own distinct idea of the total 
number of nodes, then I expect this requires an N:1 match on 
(group,service).  That is, there is 1 "should be" value for a given 
(service,group) on the RHS, and multiple nodes each with their own count of 
(service,group) on the LHS.

If that's the case, it might end up something like this:

redis_cluster_known_nodes != on (service, group) group left() count by 
(service, group) (up{service=~"exporter-redis-.*"})

but at this point I'm just speculating.


[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-17 Thread marc koser
Thanks for the pointer Brian.

From what you suggested, I updated my query to include `service` rather
than `job`, to cover the different values (one for each of the two redis
services on each `instance`). However, I'm still not getting the results I
expect:

query: 
redis_cluster_known_nodes != on (instance, service, group) count by 
(instance, service, group) (up{service=~"exporter-redis-.*"})

result:
{group="group-a", instance="node-1", service="exporter-redis-6379"} 10
{group="group-a", instance="node-1", service="exporter-redis-6380"} 10
{group="group-a", instance="node-2", service="exporter-redis-6379"} 11
{group="group-a", instance="node-2", service="exporter-redis-6380"} 16
{group="group-a", instance="node-3", service="exporter-redis-6379"} 16
{group="group-a", instance="node-3", service="exporter-redis-6380"} 16
{group="group-a", instance="node-4", service="exporter-redis-6379"} 16
{group="group-a", instance="node-4", service="exporter-redis-6380"} 16
{group="group-a", instance="node-5", service="exporter-redis-6379"} 16
{group="group-a", instance="node-5", service="exporter-redis-6380"} 16

I would expect only those whose count is != 10 to be included in the result.


Here's a metric sample of those used in the query:
``` 
up{group="group-a", instance="node-1", job="redis-cluster", 
service="exporter-redis-6379", team="sre"} 1
up{group="group-a", instance="node-1", job="redis-cluster", 
service="exporter-redis-6380", team="sre"} 1
up{group="group-a", instance="node-2", job="redis-cluster", 
service="exporter-redis-6379"} 1
up{group="group-a", instance="node-2", job="redis-cluster", 
service="exporter-redis-6380"} 1
up{group="group-a", instance="node-3", job="redis-cluster", 
service="exporter-redis-6379"} 1
up{group="group-a", instance="node-3", job="redis-cluster", 
service="exporter-redis-6380"} 1
up{group="group-a", instance="node-4", job="redis-cluster", 
service="exporter-redis-6379"} 1
up{group="group-a", instance="node-4", job="redis-cluster", 
service="exporter-redis-6380"} 1
up{group="group-a", instance="node-5", job="redis-cluster", 
service="exporter-redis-6379"} 1
up{group="group-a", instance="node-5", job="redis-cluster", 
service="exporter-redis-6380"} 1

redis_cluster_known_nodes{group="group-a", instance="node-1", 
job="redis-cluster", service="exporter-redis-6379", team="sre"} 10
redis_cluster_known_nodes{group="group-a", instance="node-1", 
job="redis-cluster", service="exporter-redis-6380", team="sre"} 10
redis_cluster_known_nodes{group="group-a", instance="node-2", 
job="redis-cluster", service="exporter-redis-6379"} 11
redis_cluster_known_nodes{group="group-a", instance="node-2", 
job="redis-cluster", service="exporter-redis-6380"} 16
redis_cluster_known_nodes{group="group-a", instance="node-3", 
job="redis-cluster", service="exporter-redis-6379"} 16
redis_cluster_known_nodes{group="group-a", instance="node-3", 
job="redis-cluster", service="exporter-redis-6380"} 16
redis_cluster_known_nodes{group="group-a", instance="node-4", 
job="redis-cluster", service="exporter-redis-6379"} 16
redis_cluster_known_nodes{group="group-a", instance="node-4", 
job="redis-cluster", service="exporter-redis-6380"} 16
redis_cluster_known_nodes{group="group-a", instance="node-5", 
job="redis-cluster", service="exporter-redis-6379"} 16
redis_cluster_known_nodes{group="group-a", instance="node-5", 
job="redis-cluster", service="exporter-redis-6380"} 16
```

[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-13 Thread Brian Candler
Sorry, second to last sentence was unclear.  What I meant was:


*If the LHS vector contains N metrics with a particular value of the 
"group" label, which correspond to exactly 1 metric on the RHS with the 
matching label value, or vice versa, then you can use N:1 matching.*

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/5f17c0a4-e1aa-447c-acdc-561b0a807d9an%40googlegroups.com.


[prometheus-users] Re: PromQL: multiple queries with dependent values

2022-10-13 Thread Brian Candler
> Is it possible to have one side of a query limit the results of another
> part of the same query?

Yes, but it depends on exactly what you mean. The details are here:
https://prometheus.io/docs/prometheus/latest/querying/operators/
It depends on whether you can construct vectors for the LHS and RHS which 
have corresponding labels.

If you can give some specific examples of the metrics themselves - 
including all their labels - then we can see whether it's possible to do 
what you want in PromQL.  Right now the requirements are unclear.

> redis_cluster_known_nodes != scalar(count(up{service=~"redis-exporter"}))
>
> The shared label value would be something like group="cluster-a", and
> should not evaluate metrics where group="cluster-b"

You need to arrange both LHS and RHS to have some corresponding labels 
before you can combine them with any operator such as !=.  The RHS has no 
"group" label at the moment, in fact it's not even a vector, but you could 
do:

count by (group) (up{service="redis-exporter"})

Then, assuming that redis_cluster_known_nodes also has a "group" label, you 
can do:

redis_cluster_known_nodes != on (group) count by (group) 
(up{service="redis-exporter"})

This will work as long as the LHS and RHS both have exactly *one* metric 
for a given value of the "group" label.

If the LHS has N values of "group" for 1 on the RHS, or vice versa, then 
you can use N:1 matching as described in the documentation ("group_left" or 
"group_right").

If there are multiple matches on both LHS and RHS for the same value of 
group, then the query will fail.  You will have to include some more labels 
in the on(...) list to get a unique match.
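
The matching rules above can be sketched in Python as follows. This is only a rough model of one-to-one matching with `on (group)`; the function name, the sample values and the error text are invented, and real PromQL reports its own error messages.

```python
from collections import defaultdict


def match_on_group(lhs, rhs):
    """Rough model of `lhs != on (group) rhs` without group_left/group_right.

    lhs and rhs are lists of (labels_dict, value) samples.
    """
    def index(samples):
        by_group = defaultdict(list)
        for labels, value in samples:
            by_group[labels["group"]].append(value)
        return by_group

    left, right = index(lhs), index(rhs)
    out = {}
    for group, left_values in left.items():
        right_values = right.get(group)
        if right_values is None:
            continue  # no partner on the other side: the series is dropped
        if len(left_values) > 1 or len(right_values) > 1:
            # One-to-one matching needs exactly one series per side and label value.
            raise ValueError(f"multiple matches for group={group!r}")
        if left_values[0] != right_values[0]:
            out[group] = left_values[0]
    return out


rhs = [({"group": "cluster-a"}, 3), ({"group": "cluster-b"}, 3)]
lhs_ok = [({"group": "cluster-a"}, 4), ({"group": "cluster-b"}, 3)]
print(match_on_group(lhs_ok, rhs))

# A second cluster-a series on the left breaks the one-to-one requirement.
lhs_dup = lhs_ok + [({"group": "cluster-a", "instance": "node-2"}, 4)]
try:
    match_on_group(lhs_dup, rhs)
except ValueError as err:
    print(err)
```

In the first call only cluster-a is returned, because its value differs from the right-hand side while cluster-b's does not. The second call fails, which is the situation where either a group modifier or more labels in the `on(...)` list are needed.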

To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/09bdd5d2-59ea-451f-a431-b2d9665417afn%40googlegroups.com.