The actual comparison is `<=`, which is why you received the alert. But if your tolerances are tight enough that `<=` versus `<` matters, then your tolerances are probably too tight.
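To make the boundary case concrete, here is a minimal Python sketch of the arithmetic (it is not Morgoth's actual code). It assumes support is `count / total` as described in the earlier reply quoted below, that the comparison is the inclusive `support <= minSupport`, and that each window in your table was counted as a new event with `count = 1`. That last point is an inference from your scores, which match `1 - 1/N` for N = 20 through 28. The function names are made up for illustration.

```python
MIN_SUPPORT = 0.05  # minSupport from the TICKscript in this thread


def support(count: int, total: int) -> float:
    """Fraction of all windows seen so far that match this event."""
    return count / total


def anomaly_score(avg_support: float) -> float:
    """anomalyScore = 1 - averageSupport (one fingerprinter, so no averaging)."""
    return 1.0 - avg_support


def is_anomalous(avg_support: float, min_support: float = MIN_SUPPORT) -> bool:
    """Inclusive comparison: support equal to minSupport still alerts."""
    return avg_support <= min_support


if __name__ == "__main__":
    # Assumption: every window is a never-before-seen event, so count is 1.
    for total in (19, 20, 21, 28):
        s = support(1, total)
        print(total, round(anomaly_score(s), 4), is_anomalous(s))
```

Under these assumptions, 19 windows give a support of about 0.053 and no alert, the 20th gives exactly 0.05 and alerts only because the comparison is inclusive, and every later window falls below 0.05 and alerts either way.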
I would first recommend that you tweak the `sigmas` value, maybe increasing it to 3.5 or 4. To iterate quickly on these tests, I recommend that you create a recording of the data set, then tweak the value, replay the recording, check the results, and repeat until you have something you like. If you share your recording with me I would be willing to take a quick look as well. As it is, it's a little hard to give good advice based on a handful of data points.

On Tuesday, November 8, 2016 at 7:47:39 AM UTC-7, amith...@gmail.com wrote:
>
> On Thursday, 27 October 2016 21:46:08 UTC+5:30, nath...@influxdb.com wrote:
> > Clarification from Amith:
> >
> > Hi Nathaniel,
> >
> > Thanks a lot for your quick reply. What is confusing for me here is how Morgoth calculated the anomalyScore field, whose value has turned out to be 0.9897172236503856, and how this is being used to detect an anomaly.
> > How does this particular node function
> >
> > …
> > @morgoth()
> >     .field(field)
> >     .scoreField(scoreField)
> >     .minSupport(minSupport)
> >     .errorTolerance(errorTolerance)
> >     .consensus(consensus)
> >     // Configure a single Sigma fingerprinter
> >     .sigma(sigmas).
> >
> > You can choose some arbitrary data to help me understand this. :)
> > Thanks,
> > Amith
> >
> > My response:
> >
> > The `anomalyScore` is `1 - averageSupport`, where averageSupport is the average of the support values returned from each of the fingerprinters. In your case you only have one fingerprinter, `sigma`, so using the anomalyScore of ~ `0.99` we can determine that the sigma fingerprinter returned a support of ~ `0.01`. Support is defined as `count / total`, where count is the number of times a specific event has been seen and total is the total number of events seen. The support can be interpreted as a frequency percentage, i.e. the most recent window has only been seen 1% of the time.
> > Since 0.01 is < 0.05 (the min support defined), an anomaly was triggered. Taking this back to the anomaly score, it can be interpreted that 99% of the time we do not see an event like this one.
> >
> > Remember that Morgoth distinguishes different windows as different events using the fingerprinters. In your case the sigma function is computing the standard deviation and mean of the windows it receives. If a window arrives that is more than 3 standard deviations away from the mean, then it is not considered the same event and is a unique event.
> >
> > Taking all of that and putting it together, receiving an anomaly score of 99% out of Morgoth for your setup can be interpreted as: you have sent several 1m windows to Morgoth. The window that triggered the anomaly event is only similar to ~1% of those windows, where similar is defined as being within 3 standard deviations.
> >
> > On Thursday, October 27, 2016 at 9:30:13 AM UTC-6, nath...@influxdb.com wrote:
> >
> > In short there are two parts to Morgoth.
> >
> > 1. A system that counts the frequency of different kinds of events. This is the lossy counting part.
> > 2. A system that determines if a window of data is the same as an existing event being tracked or something new. This is the fingerprinting part.
> >
> > Here is a quick read through for those concepts: http://docs.morgoth.io/docs/detection_framework/
> >
> > It's a little hard to tell if Morgoth has done anything unexpected without more detail. Can you share some of the data that led to this alert, so I can talk to the specifics of what is going on? Or maybe you could ask a more specific question about which part is confusing?
> >
> > On Thursday, October 27, 2016 at 6:47:02 AM UTC-6, amith...@gmail.com wrote:
> >
> > Hi All,
> >
> > I am trying to run Morgoth as a child process of Kapacitor, but I am failing to understand how Morgoth functions.
> > Below is the sample TICKscript I tried out of the Morgoth docs. It is generating some alerts, but I am unable to figure out if they are supposed to be triggered the way they have been. Pasting a snippet of the alert as well.
> >
> > I basically want to understand the functioning of Morgoth through this example.
> >
> > Alert
> > ===================================================================
> > {
> >   "id":"cpu:cpu=cpu-total,host=ip-10-121-48-24.ec2.internal,",
> >   "message":"cpu:cpu=cpu-total,host=ip-10-121-48-24.ec2.internal, is CRITICAL",
> >   "details":"",
> >   "time":"2016-10-27T11:33:00Z",
> >   "duration":21780000000000,
> >   "level":"CRITICAL",
> >   "data":{
> >     "series":[
> >       {
> >         "name":"cpu",
> >         "tags":{
> >           "cpu":"cpu-total",
> >           "host":"ip-10-121-48-24.ec2.internal"
> >         },
> >         "columns":[
> >           "time",
> >           "anomalyScore",
> >           "usage_guest",
> >           "usage_guest_nice",
> >           "usage_idle",
> >           "usage_iowait",
> >           "usage_irq",
> >           "usage_nice",
> >           "usage_softirq",
> >           "usage_steal",
> >           "usage_system",
> >           "usage_user"
> >         ],
> >         "values":[
> >           [
> >             "2016-10-27T11:33:00Z",
> >             0.9897172236503856,
> >             0,
> >             0,
> >             99.49748743708487,
> >             0,
> >             0,
> >             0,
> >             0,
> >             0,
> >             0.5025125628122904,
> >             0
> >           ]
> > ===================================================================
> > // The measurement to analyze
> > var measurement = 'cpu'
> > // Optional group by dimensions
> > var groups = [*]
> > // Optional where filter
> > var whereFilter = lambda: TRUE
> > // The amount of data to window at once
> > var window = 1m
> > // The field to process
> > var field = 'usage_idle'
> > // The name for the anomaly score field
> > var scoreField = 'anomalyScore'
> > // The minimum support
> > var minSupport = 0.05
> > // The error tolerance
> > var errorTolerance = 0.01
> > // The consensus
> > var consensus = 0.5
> > // Number of sigmas allowed for normal window deviation
> > var sigmas = 3.0
> > stream
> >     // Select the data we want
> >     |from()
> >         .measurement(measurement)
> >         .groupBy(groups)
> >         .where(whereFilter)
> >     // Window the data for a certain amount of time
> >     |window()
> >         .period(window)
> >         .every(window)
> >         .align()
> >     // Send each window to Morgoth
> >     @morgoth()
> >         .field(field)
> >         .scoreField(scoreField)
> >         .minSupport(minSupport)
> >         .errorTolerance(errorTolerance)
> >         .consensus(consensus)
> >         // Configure a single Sigma fingerprinter
> >         .sigma(sigmas)
> >     // Morgoth returns any anomalous windows
> >     |alert()
> >         .details('')
> >         .crit(lambda: TRUE)
> >         .log('/tmp/cpu_alert.log')
>
> Thanks a lot Nathaniel for your explanation of Morgoth. I have come back with a new example and its set of alerts. I will briefly describe what I am trying to achieve here.
>
> Below is a set of data with the count of errors (eventcount) that occurred for a particular error code in the IIS logs. I want to run Morgoth on the field eventcount to detect whether it is an anomaly.
>
> time                      app          eventcount  status     tech
> 2016-11-07T11:31:28.261Z  "OTSI"       586         "Success"  "IIS"
> 2016-11-07T11:32:03.254Z  "OTSI"       1           "Failure"  "IIS"
> 2016-11-07T11:33:03.243Z  "OTSI"       8           "Success"  "IIS"
> 2016-11-07T11:33:23.259Z  "ANALYTICS"  158         "Success"  "IIS"
> 2016-11-07T11:33:23.26Z   "ANALYTICS"  24          "Failure"  "IIS"
>
> My TICKscript:
>
> // The measurement to analyze
> var measurement = 'eventflow_IIS'
>
> // The amount of data to window at once
> var window = 1m
>
> // The field to process
> var field = 'eventcount'
>
> // The name for the anomaly score field
> var scoreField = 'anomalyScore'
>
> // The minimum support
> var minSupport = 0.05
>
> // The error tolerance
> var errorTolerance = 0.01
>
> // The consensus
> var consensus = 0.5
>
> // Number of sigmas allowed for normal window deviation
> var sigmas = 3.0
>
> batch
>     |query('''
>         SELECT *
>         FROM "statistics"."autogen"."eventflow_IIS"
>     ''')
>         .period(1m)
>         .every(1m)
>         .groupBy(*)
>     // |.where(lambda: TRUE)
>     @morgoth()
>         .field(field)
>         .scoreField(scoreField)
>         .minSupport(minSupport)
>         .errorTolerance(errorTolerance)
>         .consensus(consensus)
>         // Configure a single Sigma fingerprinter
>         .sigma(sigmas)
>     // Morgoth returns any anomalous windows
>     |alert()
>         .details('Count is anomalous')
>         .id('kapacitor/{{ .TaskName }}/{{ .Name }}/{{ .Group }}')
>         .message('{{ .ID }} is at level {{ .Level }} Errorcount is:{{ index .Fields "eventcount" }}')
>         .crit(lambda: TRUE)
>         .log('/tmp/morgothbb.log')
>     |influxDBOut()
>         .database('anomaly')
>         .retentionPolicy('autogen')
>         .flushInterval(1s)
>         .measurement('Anomaly')
>         // .tag('eventcount','field')
>         // .tag('AnomalyScore','scoreField')
>         // .tag('Time','time')
>         // .tag('Status','status')
>         .precision('u')
>
> Below are the alerts it generated, written into a table.
>
> time                            anomalyScore        app          eventcount  status     tech
> 2016-11-08T09:34:40.169285533Z  0.95                "OTSI"       296         "Success"  "IIS"
> 2016-11-08T09:35:40.171285533Z  0.9523809523809523  "OTSI"       28          "Success"  "IIS"
> 2016-11-08T09:36:40.170285533Z  0.9545454545454546  "OTSI"       12          "Success"  "IIS"
> 2016-11-08T09:37:40.169285533Z  0.9565217391304348  "OTSI"       20          "Success"  "IIS"
> 2016-11-08T09:38:40.170285533Z  0.9583333333333334  "OTSI"       249         "Success"  "IIS"
> 2016-11-08T09:39:40.167285533Z  0.96                "OTSI"       70          "Success"  "IIS"
> 2016-11-08T09:43:00.167285533Z  0.9615384615384616  "ANALYTICS"  1           "Success"  "IIS"
> 2016-11-08T09:43:40.164285533Z  0.962962962962963   "OTSI"       24          "Success"  "IIS"
> 2016-11-08T09:52:00.160285533Z  0.9642857142857143  "ANALYTICS"  1           "Success"  "IIS"
>
> My question is:
>
> How should I interpret the anomaly score generated here (~0.95) together with the counts for which Morgoth has triggered an anomaly? Going by our earlier discussion, support here turns out to be ~0.05 (1 - anomaly score), and an anomaly gets triggered when support < min support. In this case that is 0.05 < 0.05, which should not be true. But an anomaly is still getting triggered almost every minute. Could you please help me understand this?
>
> Also let me know if e, M, N need to be tweaked here for this particular data sample to generate meaningful alerts out of it.