To close out this discussion, I created another JIRA to take care of the "*Triage Calculated Values from the Profiler*" problem. Feel free to let me know if anything else was missed.

[1] Triage Metrics Produced by the Profiler
https://issues.apache.org/jira/browse/METRON-701

On Thu, Feb 2, 2017 at 10:15 AM, Nick Allen <n...@nickallen.org> wrote:

> I created 3 separate JIRAs to track the "Threat Triage Transparency" portion of the work falling out of this discussion thread. The first would create a mechanism to do string interpolation. The second would enhance threat triage to use the string interpolation. The third would enhance the output of threat triage.
>
> [1] Create String Formatting Function for Stellar
> https://issues.apache.org/jira/browse/METRON-687
>
> [2] Allow Threat Triage Comment Field to Contain Stellar Expressions
> https://issues.apache.org/jira/browse/METRON-688
>
> [3] Record of Rule Set that Fired During Threat Triage
> https://issues.apache.org/jira/browse/METRON-686
>
> Please let me know if anyone's concerns were not captured. I will create additional JIRAs for the other portion of the effort (*Triage Calculated Values from the Profiler*) once I've given everyone a little more time to voice an opinion.
>
> On Thu, Feb 2, 2017 at 9:46 AM, Nick Allen <n...@nickallen.org> wrote:
>
>> Oh, I see. Yes, very useful.
>>
>> On Thu, Feb 2, 2017 at 9:39 AM, Simon Elliston Ball <si...@simonellistonball.com> wrote:
>>
>>> That’s a part of it, certainly (and fixes another of my bugbears, so thank you!)
>>>
>>> In addition to the aggregation being stellar, I want score to be a stellar statement; I’ve put in a separate ticket for that.
>>> https://issues.apache.org/jira/browse/METRON-685
>>>
>>> Simon
>>>
>>> > On 2 Feb 2017, at 14:31, Nick Allen <n...@nickallen.org> wrote:
>>> >
>>> >> I would much rather be able to say something like score = some stellar statement that returns a float...
>>> >
>>> > Completely agree. FYI - We added METRON-683 yesterday that I believe supports what you are saying. Feel free to add commentary.
>>> >
>>> > https://issues.apache.org/jira/browse/METRON-683
>>> >
>>> > On Thu, Feb 2, 2017 at 9:02 AM, Simon Elliston Ball <si...@simonellistonball.com> wrote:
>>> >
>>> >> I completely agree with Nick’s transparency comments, and like the design of the configuration, especially provision for messaging around the nature of the rule fired.
>>> >>
>>> >> I would just like to add a small point on the capabilities here. If the message could have embedded values through some sort of template for a stellar statement, it would make for a better, more dynamic alert reason.
>>> >>
>>> >> I would also like to see the score field capable of outputting the value of a stellar statement. At the moment the idea of a static score being passed on means that if I have a probabilistic result I want to combine with other triage sources, I have to do a lot of bucketing into fixed values. I would much rather be able to say something like score = some stellar statement that returns a float, ‘alertness’ = threshold of this. That way I can combine multiple triage rules to trigger an overall alert, making the aggregators more meaningful.
>>> >>
>>> >> Simon
>>> >>
>>> >>> On 2 Feb 2017, at 12:40, Carolyn Duby <cd...@hortonworks.com> wrote:
>>> >>>
>>> >>> For profiler alerts it will be helpful during analysis to see the alerts that caused the anomaly. The meta alert is useful for incidents involving correlation of multiple events.
>>> >>>
>>> >>> Also you will need to filter out known hosts that trigger anomalies. For example vulnerability scanning software.
>>> >>>
>>> >>> One final thing to consider is anomalies happen every day without a security incident. Depending on the network the profiler alerts could get very noisy so it might be better to correlate profiler alerts with other alerts.
>>> >>>
>>> >>> Thanks
>>> >>> Carolyn
>>> >>>
>>> >>> Sent from my Verizon, Samsung Galaxy smartphone
>>> >>>
>>> >>> -------- Original message --------
>>> >>> From: Casey Stella <ceste...@gmail.com>
>>> >>> Date: 2/1/17 2:28 PM (GMT-05:00)
>>> >>> To: dev@metron.incubator.apache.org
>>> >>> Subject: Re: [Discuss] Improve Alerting
>>> >>>
>>> >>> I like the direction. One thing that we may want is for comment to just be a stellar expression and construct a function to essentially do String.format(). So, that'd become:
>>> >>>
>>> >>> "triageConfig" : {
>>> >>>   "riskLevelRules" : [
>>> >>>     {
>>> >>>       "name" : "Abnormal Value",
>>> >>>       "comment" : "FORMAT('For %s; the value %s exceeds threshold of %d', hostname, value, value_threshold)",
>>> >>>       "rule" : "value > value_threshold",
>>> >>>       "score" : 10
>>> >>>     }
>>> >>>   ],
>>> >>>   "aggregator" : "MAX"
>>> >>> }
>>> >>>
>>> >>> The reason:
>>> >>>
>>> >>> - It's integrated and stellar is our default scripting layer
>>> >>> - It supports doing some computation in the message
>>> >>>
>>> >>> On Wed, Feb 1, 2017 at 2:21 PM, Nick Allen <n...@nickallen.org> wrote:
>>> >>>
>>> >>>> Like I said, here is a proposed solution to one of the gaps I identified in the previous email.
>>> >>>>
>>> >>>> *Problem*
>>> >>>>
>>> >>>> There is little transparency into the Threat Triage process itself. When Threat Triage runs, all I get is a score. I don't know how that score was arrived at, which rules were triggered, and the specific values that caused a rule to trigger.
>>> >>>>
>>> >>>> More specifically, there is no way to generate a message that looks like "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the threshold of '202'". This makes it difficult for an analyst to action the alert.
>>> >>>>
>>> >>>> *Proposed Solution*
>>> >>>>
>>> >>>> To improve the transparency of the Threat Triage process, I am proposing these enhancements.
>>> >>>>
>>> >>>> 1. Threat Triage should attach to each message all of the rules that fired in addition to the total calculated threat triage score.
>>> >>>>
>>> >>>> 2. Threat Triage should allow a custom message to be generated for each rule. The custom message would allow for some form of string interpolation so that I can add specific values from each message to the generated alert. We could allow this in one or both of the new fields that Casey just added, name and comment.
>>> >>>>
>>> >>>> *Example*
>>> >>>>
>>> >>>> 1. In this example, we have a telemetry message with a field called 'value' that we need to monitor. In Enrichment, I calculate some sort of value threshold, over which an alert should be generated.
>>> >>>>
>>> >>>> 2. In Threat Triage, I use the calculated value threshold to alert on any message that has a value exceeding this threshold.
>>> >>>>
>>> >>>> 3. I can embed values from the message, like the hostname, value, and value threshold, into the alert produced by Threat Triage. Notice that I am using ${this} for string interpolation, but it could be any syntax that we choose.
>>> >>>>
>>> >>>> "triageConfig" : {
>>> >>>>   "riskLevelRules" : [
>>> >>>>     {
>>> >>>>       "name" : "Abnormal Value",
>>> >>>>       "comment" : "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
>>> >>>>       "rule" : "value > value_threshold",
>>> >>>>       "score" : 10
>>> >>>>     }
>>> >>>>   ],
>>> >>>>   "aggregator" : "MAX"
>>> >>>> }
>>> >>>>
>>> >>>> 4. The Threat Triage process today would add only the total calculated score.
>>> >>>>
>>> >>>> "threat.triage.level": 10.0
>>> >>>>
>>> >>>> With this proposal, Threat Triage would add the following to the message.
>>> >>>>
>>> >>>> Notice how each of the ${variables} have been replaced with the actual values extracted from the message. This allows for more contextual information to action the alert.
>>> >>>>
>>> >>>> "threat.triage": {
>>> >>>>   "score": 10.0,
>>> >>>>   "rules": [
>>> >>>>     {
>>> >>>>       "name": "Abnormal Value",
>>> >>>>       "comment" : "For 10.0.0.1; the value 101 exceeds threshold of 42",
>>> >>>>       "score" : 10
>>> >>>>     }
>>> >>>>   ]
>>> >>>> }
>>> >>>>
>>> >>>> What do you think? Any alternative ideas?
>>> >>>>
>>> >>>> On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <n...@nickallen.org> wrote:
>>> >>>>
>>> >>>>> I'd like to explore the functionality that we have in Metron using a motivating example. I think this will help highlight some gaps where we can enhance Metron.
>>> >>>>>
>>> >>>>> The motivating example is that I would like to create an alert if the number of inbound flows to any host over a 15 minute interval is abnormal. I would like the alert to contain the specific information below to streamline the triage process.
>>> >>>>>
>>> >>>>> Rule: Abnormal number of inbound flows
>>> >>>>> Bin: 15 mins
>>> >>>>> Alert: The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the threshold of '202'
>>> >>>>>
>>> >>>>> *What Works*
>>> >>>>>
>>> >>>>> In some ways, this example is similar to the "Outlier Detection" demo that I performed with the Profiler a few months back. We have most of what we need to do this with a couple caveats.
>>> >>>>>
>>> >>>>> 1. An enrichment would be added to enrich the message with the correct internal hostname 'powned.svr.bank.com'.
>>> >>>>>
>>> >>>>> 2. With the Profiler, I can capture some idea of what "normal" is for the number of inbound flows across 15 minute intervals.
>>> >>>>> 3. With Threat Triage, I can create rules that alert when a value exceeds what the Profiler defines as normal.
>>> >>>>>
>>> >>>>> *What's Missing*
>>> >>>>>
>>> >>>>> It's nice to know that we are almost all the way there with this example. Unfortunately, there are two gaps that fall out of this.
>>> >>>>>
>>> >>>>> 1. *Threat Triage Transparency*
>>> >>>>>
>>> >>>>> There is little transparency into the Threat Triage process itself. When Threat Triage runs, all I get is a score. I don't know how that score was arrived at, which rules were triggered, and the specific values that caused a rule to trigger.
>>> >>>>>
>>> >>>>> More specifically, there is no way to generate a message that looks like "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the threshold of '202'".
>>> >>>>>
>>> >>>>> 2. *Triage Calculated Values from the Profiler*
>>> >>>>>
>>> >>>>> Also, the value being interrogated here, the number of inbound flows, is not a static value contained within any single telemetry message. This value is calculated across multiple messages by the Profiler. The current Threat Triage process cannot be used to interrogate values calculated by the Profiler.
>>> >>>>>
>>> >>>>> To try and keep this email concise and digestible, I am going to send a follow-on discussing proposed solutions for each of these separately.
>>> >>>>
>>> >>>> --
>>> >>>> Nick Allen <n...@nickallen.org>
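
For anyone reading the thread later, here is a minimal Python sketch of the `${...}` interpolation and the per-rule output proposed in the quoted messages. It is only an illustration under stated assumptions, not Metron code: the `triage` function and `RULES` list are hypothetical names, the Stellar rule is stood in for by a plain Python callable, and the "MAX" aggregator is approximated with `max`. Python's `string.Template` happens to use the same `${name}` syntax as the example.

```python
from string import Template

# Hypothetical rule set mirroring the quoted "triageConfig" example.
# The Stellar expression "value > value_threshold" is replaced by a
# Python callable; this is an assumption, not how Metron evaluates rules.
RULES = [
    {
        "name": "Abnormal Value",
        "comment": "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
        "rule": lambda m: m["value"] > m["value_threshold"],
        "score": 10,
    }
]


def triage(message, rules=RULES):
    """Return a dict shaped like the proposed "threat.triage" field."""
    fired = []
    for rule in rules:
        if rule["rule"](message):
            fired.append({
                "name": rule["name"],
                # safe_substitute leaves unknown ${fields} untouched
                # instead of raising KeyError.
                "comment": Template(rule["comment"]).safe_substitute(message),
                "score": rule["score"],
            })
    # Stand-in for the "MAX" aggregator: highest score among fired rules,
    # or 0 when nothing fired.
    score = float(max((r["score"] for r in fired), default=0))
    return {"score": score, "rules": fired}


message = {"hostname": "10.0.0.1", "value": 101, "value_threshold": 42}
result = triage(message)
print(result["rules"][0]["comment"])
```

With the sample message, the rule fires and the comment carries the hostname, value, and threshold taken from the message, which is the transparency the thread asks for. A message whose value is below the threshold yields an empty rules list.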