I completely agree with Nick’s transparency comments, and I like the design of the configuration, especially the provision for messaging about the nature of the rule that fired.
I would just like to add a small point on the capabilities here. If the message could embed values through some sort of template over a Stellar statement, it would make for a better, more dynamic alert reason.

I would also like to see the score field capable of outputting the value of a Stellar statement. At the moment, passing on a static score means that if I have a probabilistic result I want to combine with other triage sources, I have to do a lot of bucketing into fixed values. I would much rather be able to say something like score = some Stellar statement that returns a float, 'alertness' = threshold of this. That way I can combine multiple triage rules to trigger an overall alert, making the aggregators more meaningful.

Simon

> On 2 Feb 2017, at 12:40, Carolyn Duby <cd...@hortonworks.com> wrote:
>
> For profiler alerts it will be helpful during analysis to see the alerts that
> caused the anomaly. The meta alert is useful for incidents involving
> correlation of multiple events.
>
> Also you will need to filter out known hosts that trigger anomalies. For
> example vulnerability scanning software.
>
> One final thing to consider is anomalies happen every day without a security
> incident. Depending on the network the profiler alerts could get very noisy
> so it might be better to correlate profiler alerts with other alerts.
>
> Thanks
> Carolyn
>
> -------- Original message --------
> From: Casey Stella <ceste...@gmail.com>
> Date: 2/1/17 2:28 PM (GMT-05:00)
> To: dev@metron.incubator.apache.org
> Subject: Re: [Discuss] Improve Alerting
>
> I like the direction. One thing that we may want is for comment to just be
> a stellar expression and construct a function to essentially do
> String.format().
> So, that'd become:
>
> "triageConfig" : {
>   "riskLevelRules" : [
>     {
>       "name" : "Abnormal Value",
>       "comment" : "FORMAT('For %s; the value %s exceeds threshold of %d', hostname, value, value_threshold)",
>       "rule" : "value > value_threshold",
>       "score" : 10
>     }
>   ],
>   "aggregator" : "MAX"
> }
>
> The reason:
>
>    - It's integrated and stellar is our default scripting layer
>    - It supports doing some computation in the message
>
>
> On Wed, Feb 1, 2017 at 2:21 PM, Nick Allen <n...@nickallen.org> wrote:
>
>> Like I said, here is a proposed solution to one of the gaps I identified in
>> the previous email.
>>
>> *Problem*
>>
>> There is little transparency into the Threat Triage process itself. When
>> Threat Triage runs, all I get is a score. I don't know how that score was
>> arrived at, which rules were triggered, and the specific values that caused
>> a rule to trigger.
>>
>> More specifically, there is no way to generate a message that looks like
>> "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
>> threshold of '202'". This makes it difficult for an analyst to action the
>> alert.
>>
>> *Proposed Solution*
>>
>> To improve the transparency of the Threat Triage process, I am proposing
>> these enhancements.
>>
>> 1. Threat Triage should attach to each message all of the rules that fired
>> in addition to the total calculated threat triage score.
>>
>> 2. Threat Triage should allow a custom message to be generated for each
>> rule. The custom message would allow for some form of string interpolation
>> so that I can add specific values from each message to the generated
>> alert. We could allow this in one or both of the new fields that Casey
>> just added, name and comment.
>>
>>
>> *Example*
>>
>> 1. In this example, we have a telemetry message with a field called 'value'
>> that we need to monitor. In Enrichment, I calculate some sort of value
>> threshold, over which an alert should be generated.
>>
>> 2. In Threat Triage, I use the calculated value threshold to alert on any
>> message that has a value exceeding this threshold.
>>
>> 3. I can embed values from the message, like the hostname, value, and value
>> threshold, into the alert produced by Threat Triage. Notice that I am
>> using ${this} for string interpolation, but it could be any syntax that we
>> choose.
>>
>>
>> "triageConfig" : {
>>   "riskLevelRules" : [
>>     {
>>       "name" : "Abnormal Value",
>>       "comment" : "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
>>       "rule" : "value > value_threshold",
>>       "score" : 10
>>     }
>>   ],
>>   "aggregator" : "MAX"
>> }
>>
>>
>> 4. The Threat Triage process today would add only the total calculated
>> score.
>>
>> "threat.triage.level": 10.0
>>
>>
>> With this proposal, Threat Triage would add the following to the message.
>>
>> Notice how each of the ${variables} have been replaced with the actual
>> values extracted from the message. This allows for more contextual
>> information to action the alert.
>>
>> "threat.triage": {
>>   "score": 10.0,
>>   "rules": [
>>     {
>>       "name": "Abnormal Value",
>>       "comment" : "For 10.0.0.1; the value 101 exceeds threshold of 42",
>>       "score" : 10
>>     }
>>   ]
>> }
>>
>>
>> What do you think? Any alternative ideas?
>>
>>
>> On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <n...@nickallen.org> wrote:
>>
>>> I'd like to explore the functionality that we have in Metron using a
>>> motivating example. I think this will help highlight some gaps where we
>>> can enhance Metron.
>>>
>>> The motivating example is that I would like to create an alert if the
>>> number of inbound flows to any host over a 15 minute interval is abnormal.
>>> I would like the alert to contain the specific information below to
>>> streamline the triage process.
>>>
>>> Rule: Abnormal number of inbound flows
>>> Bin: 15 mins
>>> Alert: The host 'powned.svr.bank.com' has '230' inbound flows, exceeding
>>> the threshold of '202'
>>>
>>>
>>> *What Works*
>>>
>>> In some ways, this example is similar to the "Outlier Detection" demo that
>>> I performed with the Profiler a few months back. We have most of what we
>>> need to do this with a couple caveats.
>>>
>>> 1. An enrichment would be added to enrich the message with the correct
>>> internal hostname 'powned.svr.bank.com'.
>>>
>>> 2. With the Profiler, I can capture some idea of what "normal" is for the
>>> number of inbound flows across 15 minute intervals.
>>>
>>> 3. With Threat Triage, I can create rules that alert when a value exceeds
>>> what the Profiler defines as normal.
>>>
>>>
>>> *What's Missing*
>>>
>>> It's nice to know that we are almost all the way there with this example.
>>> Unfortunately, there are two gaps that fall out of this.
>>>
>>> 1. *Threat Triage Transparency*
>>>
>>> There is little transparency into the Threat Triage process itself. When
>>> Threat Triage runs, all I get is a score. I don't know how that score was
>>> arrived at, which rules were triggered, and the specific values that caused
>>> a rule to trigger.
>>>
>>> More specifically, there is no way to generate a message that looks like
>>> "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
>>> threshold of '202'".
>>>
>>>
>>> 2. *Triage Calculated Values from the Profiler*
>>>
>>> Also, the value being interrogated here, the number of inbound flows, is
>>> not a static value contained within any single telemetry message. This
>>> value is calculated across multiple messages by the Profiler. The current
>>> Threat Triage process cannot be used to interrogate values calculated by
>>> the Profiler.
>>>
>>>
>>> To try and keep this email concise and digestible, I am going to send a
>>> follow-on discussing proposed solutions for each of these separately.
>>
>> --
>> Nick Allen <n...@nickallen.org>
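
To make the computed-score idea from the top of this message a bit more concrete, here is a rough Python sketch. It is not Metron or Stellar code, and every name in it (Rule, triage, alert_threshold, the anomaly_probability field) is made up purely for illustration. Each rule's score is computed from the message instead of being a fixed number, the aggregator combines the scores of whatever rules fired, and the alert decision is a threshold on the aggregate. The comment uses the ${field} interpolation syntax from Nick's example.

```python
from dataclasses import dataclass
from string import Template
from typing import Any, Callable, Mapping

Message = Mapping[str, Any]


@dataclass
class Rule:
    """Hypothetical triage rule; the field names mirror the proposed config."""
    name: str
    comment: str                        # ${field} template filled from the message
    rule: Callable[[Message], bool]     # predicate: does the rule fire?
    score: Callable[[Message], float]   # computed score instead of a constant


def triage(message, rules, aggregator=max, alert_threshold=0.8):
    """Evaluate all rules, aggregate the scores of those that fired, and
    decide 'alertness' by thresholding the aggregate (illustrative only)."""
    fired = [
        {
            "name": r.name,
            # safe_substitute leaves unknown ${fields} untouched rather than failing
            "comment": Template(r.comment).safe_substitute(message),
            "score": r.score(message),
        }
        for r in rules
        if r.rule(message)
    ]
    total = aggregator(f["score"] for f in fired) if fired else 0.0
    return {"score": total, "rules": fired, "is_alert": total >= alert_threshold}


# Example: one rule whose score is a probability carried on the message.
message = {
    "hostname": "10.0.0.1",
    "value": 101,
    "value_threshold": 42,
    "anomaly_probability": 0.93,  # assumed to be added upstream, e.g. by an enrichment
}
rules = [
    Rule(
        name="Abnormal Value",
        comment="For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
        rule=lambda m: m["value"] > m["value_threshold"],
        score=lambda m: m["anomaly_probability"],
    ),
]
result = triage(message, rules)
```

With scores as floats, swapping the aggregator (for example passing sum, or a function that combines probabilities) changes how several weak signals add up to an overall alert, without any bucketing into fixed values.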