Also, so far I have only tried one way of setting a custom threshold, i.e.
based on the component name. For example, all the targets under Comp-A have
a threshold of 99.9 and all the targets under Comp-B have a threshold of 95.
But now I have to give a common custom threshold, say 98, to 5 different
targets, each of which belongs to a different component. All 5 components
have more than one target, but I want the custom threshold to apply to only
a single target from each component.
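
The only approach I can think of right now is per-instance recording rules
like the ones below (just a sketch; the rule name and the instance
addresses are placeholders), which is exactly the kind of repetition I
would like to avoid:

  - record: instance_memory_critical
    expr: 98
    labels:
      instance: 'x.x.x.x:9100'
      component: 'Comp-A'

  - record: instance_memory_critical
    expr: 98
    labels:
      instance: 'y.y.y.y:9100'
      component: 'Comp-B'

  # ...and three more rules of the same shape for the remaining instances.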

On Fri, Jul 3, 2020 at 12:02 AM Yagyansh S. Kumar <yagyanshsku...@gmail.com>
wrote:

> Hi Christian,
>
> Actually, I wanted to ask if there is any better way to define the
> threshold for my 5 new servers that belong to 5 different components. Is
> writing 5 different recording rules with the same name, and different
> instance and component labels, the only way to proceed here? Won't that
> be a little too dirty to maintain? What if it were 20 servers, each
> belonging to a different component?
>
> On Tue, Jun 30, 2020 at 11:43 AM Christian Hoffmann <
> m...@hoffmann-christian.info> wrote:
>
>> Hi,
>>
>> On 6/24/20 8:09 PM, yagyans...@gmail.com wrote:
>> > Hi. Currently I am using custom thresholds for my Memory alerts.
>> > Every node exporter target has 2 main labels: cluster and component.
>> > So far my custom thresholds have been based on the component, as I had
>> > to define that particular custom threshold for all the servers of the
>> > component. But now I have 5 instances, all from different components,
>> > and I have to set their threshold to 97. How do I approach this?
>> >
>> > My typical node exporter job.
>> >   - job_name: 'node_exporter_JOB-A'
>> >     static_configs:
>> >     - targets: [ 'x.x.x.x:9100' , 'x.x.x.x:9100']
>> >       labels:
>> >         cluster: 'Cluster-A'
>> >         env: 'PROD'
>> >         component: 'Comp-A'
>> >     scrape_interval: 10s
>> >
>> > Recording rule for custom thresholds.
>> >   - record: abcd_critical
>> >     expr: 99.9
>> >     labels:
>> >       component: 'Comp-A'
>> >
>> >   - record: xyz_critical
>> >     expr: 95
>> >     labels:
>> >       component: 'Comp-B'
>> >
>> > The expression for the Memory alert:
>> > ((node_memory_MemTotal_bytes - node_memory_MemFree_bytes -
>> > node_memory_Cached_bytes) / node_memory_MemTotal_bytes * 100) *
>> > on(instance) group_left(nodename) node_uname_info > on(component)
>> > group_left() (abcd_critical or xyz_critical or on(component) count by
>> > (component)((node_memory_MemTotal_bytes - node_memory_MemFree_bytes -
>> > node_memory_Cached_bytes) / node_memory_MemTotal_bytes * 100) * 0 + 90)
>> >
>> > Now I have 5 servers, each from a different component. How do I
>> > include them in the most optimized manner?
>>
>> This looks almost like the pattern described here:
>> https://www.robustperception.io/using-time-series-as-alert-thresholds
>>
>> It looks like you have already tried to integrate the two different ways
>> to specify thresholds, right? Is there any specific problem with it?
>>
>> Sadly, this pattern quickly becomes complex, especially when nested (as
>> you would need to do here) and when combined with an already long query
>> (as in your case).
>>
>> I can only suggest trying to move some of the complexity out of the
>> query (e.g. by moving the memory calculation into a recording rule).
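>>
>> For example, something along these lines (only a rough sketch; the rule
>> name is made up):
>>
>>   groups:
>>   - name: memory
>>     rules:
>>     - record: instance:memory_used:percent
>>       expr: >
>>         (node_memory_MemTotal_bytes - node_memory_MemFree_bytes -
>>         node_memory_Cached_bytes) / node_memory_MemTotal_bytes * 100
>>
>> The alert expression can then reference instance:memory_used:percent
>> instead of repeating the full memory calculation.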
>>
>> You can also split the alerting rule into multiple rules (with the same
>> name). You will just have to ensure that each rule only ever fires for
>> its own subset of your instances (e.g. the first variant would only fire
>> for component-based thresholds, the second only for instance-based
>> thresholds).
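>>
>> Roughly like this (only a sketch; the alert name, the threshold rule
>> names and the instance matcher are placeholders, and you would still
>> need to make sure the two selections do not overlap, e.g. via extra
>> label matchers or an unless clause):
>>
>>   - alert: HighMemoryUsage
>>     expr: >
>>       instance:memory_used:percent
>>       > on(component) group_left() component_memory_critical
>>   - alert: HighMemoryUsage
>>     expr: >
>>       instance:memory_used:percent{instance=~"x.x.x.x:9100|y.y.y.y:9100"}
>>       > 98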
>>
>> Hope this helps.
>>
>> Kind regards,
>> Christian
>>
>
