Thanks so much for the reply. I do want the recovery alerts, but the problem is that when I start Kapacitor, the task treats *any* server/path in an UP status as a recovery of a *different* server/path's DOWN status. So if 7 server/paths are in a DOWN status at start-up, I get 7 down alerts (expected), but they are immediately followed by 7 recovery messages from different server/paths. Please let me know if I am not being clear enough.
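One detail worth checking (my own guess, not a confirmed fix for this behavior): the script below in the thread uses a fixed string for `.id()`, so every host/path group ends up sharing the identical alert ID "DISK SPACE WARNING", which makes a recovery from one group indistinguishable from any other group's alert in the notifications. The `.id()` property accepts the same templates as `.message()`, so a sketch of a per-group ID (template names taken from the tags already used in the script) would look roughly like:

```
var data = stream
    |from()
        .measurement('disk')
        .groupBy('host', 'path')
    |alert()
        // Make the alert ID unique per host/path group so a recovery
        // message clearly identifies which group recovered.
        .id('DISK SPACE WARNING {{ index .Tags "host" }}:{{ index .Tags "path" }}')
        .message('{{ .ID }} USED PERCENT: {{ index .Fields "used_percent" }}')
        .warn(lambda: "used_percent" >= 80)
        // Optionally only alert on genuine state transitions per group.
        .stateChangesOnly()
        .email($DISK_WARN_GRP)
```

This may not stop the startup burst itself, but it should at least make each OK message carry the tags of the group that actually recovered.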
On Wednesday, February 22, 2017 at 11:10:09 AM UTC-8, [email protected] wrote:
>
> If you want to ignore the OK alerts use the `.noRecoveries` property of the alert node. This will suppress the OK alerts.
>
> On Friday, February 17, 2017 at 3:33:16 PM UTC-7, Archie Archbold wrote:
>>
>> Hey all. Pretty new to TICK, but I have a problem that I can't wrap my head around.
>>
>> I am monitoring multiple servers, all sending data to one InfluxDB database, and using the 'host' tag to separate the servers in the DB.
>>
>> My 'disk' measurement is taking in multiple disk paths from the servers (HOSTS), each of which has a respective 'PATH' tag.
>>
>> So basically each server is assigned a HOST tag and each HOST has multiple PATH tags.
>>
>> EXPECTED FUNCTIONALITY: Kapacitor should alert upon a state change of a HOST's PATH if that path is within the alerting lambda.
>> PROBLEM: When I start the Kapacitor service, it looks like it's sensing a state change any time it sees another host/path with an opposite status.
>>
>> This is a simplified example of the alerts I am getting:
>>
>> Host: host1 Path: /path1 Status: UP
>> Host: host1 Path: /path2 Status: DOWN
>> Host: host1 Path: /path3 Status: UP
>> Host: host2 Path: /path1 Status: DOWN
>> Host: host2 Path: /path2 Status: UP
>>
>> These alerts happen once for each host/path combination, and then the service performs as expected, alerting properly when the lambda is achieved.
>>
>> The result of this is that I receive a slew of up/down alerts every time I restart the kapacitor service.
>>
>> Here is my current tick:
>>
>> var data = stream
>>     |from()
>>         .measurement('disk')
>>         .groupBy('host', 'path')
>>     |alert()
>>         .message('{{ .ID }} Server:{{ index .Tags "host" }} Path: {{ index .Tags "path" }} USED PERCENT: {{ index .Fields "used_percent" }}')
>>         .warn(lambda: "used_percent" >= 80)
>>         .id('DISK SPACE WARNING')
>>         .email($DISK_WARN_GRP)
>>
>> And the corresponding DOT:
>>
>> ID: disk_alert_warn
>> Error:
>> Template:
>> Type: stream
>> Status: enabled
>> Executing: true
>> Created: 17 Feb 17 22:27 UTC
>> Modified: 17 Feb 17 22:27 UTC
>> LastEnabled: 17 Feb 17 22:27 UTC
>> Databases Retention Policies: ["main"."autogen"]
>>
>> TICKscript:
>>
>> var data = stream
>>     |from()
>>         .measurement('disk')
>>         .groupBy('host', 'path')
>>     |alert()
>>         .message('{{ .ID }} Server:{{ index .Tags "host" }} Path: {{ index .Tags "path" }} USED PERCENT: {{ index .Fields "used_percent" }}')
>>         .warn(lambda: "used_percent" >= 80)
>>         .id('DISK SPACE WARNING')
>>         .email()
>>
>> DOT:
>>
>> digraph disk_alert_warn {
>> graph [throughput="38.00 points/s"];
>>
>> stream0 [avg_exec_time_ns="0s" ];
>> stream0 -> from1 [processed="284"];
>>
>> from1 [avg_exec_time_ns="3.9µs" ];
>> from1 -> alert2 [processed="284"];
>>
>> alert2 [alerts_triggered="14" avg_exec_time_ns="72.33µs" crits_triggered="0" infos_triggered="0" oks_triggered="7" warns_triggered="7" ];
>> }
>>
>> As you can see, I get 7 OKs triggered (for the host/path groups that are not in alert range) and 7 warns triggered (for the 7 host/path groups that are within the alert range) upon start-up. Then it behaves as normal.
>>
>> I understand that it should be alerting for the 7 host/path groups that are over 80, but why follow it with an alert about the OK groups?
>>
>> MORE INFO: When I raise the lambda to 90% (out of range for all host/paths), I get no alerts at all (which is expected).
>>
>> Thanks to anyone who can help me understand this.

To view this discussion on the web visit https://groups.google.com/d/msgid/influxdb/c7b9a16b-f6b8-4bb8-a0c6-3de0172ce217%40googlegroups.com.
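For reference, applying the `.noRecoveries` suggestion from the reply above to the script in this thread would look roughly like the sketch below. Note the trade-off: it suppresses the OK/recovery messages entirely, which addresses the startup noise but also drops the genuine recovery alerts the original poster says they want.

```
var data = stream
    |from()
        .measurement('disk')
        .groupBy('host', 'path')
    |alert()
        .message('{{ .ID }} Server:{{ index .Tags "host" }} Path: {{ index .Tags "path" }} USED PERCENT: {{ index .Fields "used_percent" }}')
        .warn(lambda: "used_percent" >= 80)
        .id('DISK SPACE WARNING')
        // Suppress all OK (recovery) events from this alert node.
        .noRecoveries()
        .email($DISK_WARN_GRP)
```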
