Great. I think we're thinking along the same lines. I just sent a follow-up with another proposal that takes this idea a little further: what if we treated the Profiler as another source of telemetry?
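To make the quoted proposal concrete, here is a minimal Python sketch (not Metron code) of a profile that counts inbound flows and, on the tick, produces 'system_alert' messages destined for the enrichment queue. The class name, the `ip_dst_addr`, `host`, and `inbound_flows` field names, and the `send` callback are all assumptions made for illustration. In Metron itself this would more likely be a "write" section in the profile definition that runs Stellar statements.

```python
import json
from collections import Counter


class InboundFlowProfile:
    """Hypothetical profile: counts inbound flows per host between ticks."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, message):
        # Assumption: 'ip_dst_addr' identifies the host receiving the flow.
        self.counts[message["ip_dst_addr"]] += 1

    def on_tick(self, now_ms):
        # Build one message per host, then reset the counts for the next window.
        messages = [
            {
                "source.type": "system_alert",  # source type from the proposal
                "is_alert": True,               # flag from the proposal
                "host": host,                   # hypothetical field name
                "inbound_flows": count,         # hypothetical field name
                "timestamp": now_ms,
            }
            for host, count in sorted(self.counts.items())
        ]
        self.counts.clear()
        return messages


def flush(profile, now_ms, send):
    """Serialize each tick message as JSON and hand it to `send`.

    `send` stands in for whatever writes to the enrichment queue.
    """
    for message in profile.on_tick(now_ms):
        send(json.dumps(message))
```

Calling `flush(profile, now_ms, print)` after a few `observe` calls prints one JSON message per host. Enrichment and Threat Triage would then handle those messages like any other telemetry.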

On Wed, Feb 1, 2017 at 2:23 PM, Casey Stella <ceste...@gmail.com> wrote:

> Regarding point 2, could we enable the profiler to write data to kafka and
> the enrichment queue?
>
> I'm proposing the profiler do something like this:
>
>    - Count the number of inbound flows
>    - On the tick, send a message to the enrichment queue containing:
>       - the number of flows
>       - A source type of 'system_alert'
>       - is_alert set to true
>    - In enrichment, we enrich and triage system_alert source data in the
>      same way we do any other.
>
> This would not solve the transparency issue, but at least make it so we
> keep triage in one place in the architecture. Also, enabling kafka writing
> would enable other types of use-cases, like situations where we find
> outliers *directly* in the profile and send the alerts directly to the
> indexing queue without triage.
>
> The only changes this proposal would require would be
>
>    1. a "write" section to a profile that takes a list of stellar
>       statements and gets run on the tick write
>    2. fixing the kafka writing stellar functions
>
> Casey
>
> On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > I'd like to explore the functionality that we have in Metron using a
> > motivating example. I think this will help highlight some gaps where we
> > can enhance Metron.
> >
> > The motivating example is that I would like to create an alert if the
> > number of inbound flows to any host over a 15 minute interval is
> > abnormal. I would like the alert to contain the specific information
> > below to streamline the triage process.
> >
> > Rule: Abnormal number of inbound flows
> > Bin: 15 mins
> > Alert: The host 'powned.svr.bank.com' has '230' inbound flows, exceeding
> > the threshold of '202'
> >
> >
> > *What Works*
> >
> > In some ways, this example is similar to the "Outlier Detection" demo
> > that I performed with the Profiler a few months back. We have most of
> > what we need to do this with a couple caveats.
> >
> > 1. An enrichment would be added to enrich the message with the correct
> > internal hostname 'powned.svr.bank.com'.
> >
> > 2. With the Profiler, I can capture some idea of what "normal" is for the
> > number of inbound flows across 15 minute intervals.
> >
> > 3. With Threat Triage, I can create rules that alert when a value exceeds
> > what the Profiler defines as normal.
> >
> >
> > *What's Missing*
> >
> > It's nice to know that we are almost all the way there with this example.
> > Unfortunately, there are two gaps that fall out of this.
> >
> > 1. *Threat Triage Transparency*
> >
> > There is little transparency into the Threat Triage process itself. When
> > Threat Triage runs, all I get is a score. I don't know how that score
> > was arrived at, which rules were triggered, and the specific values that
> > caused a rule to trigger.
> >
> > More specifically, there is no way to generate a message that looks like
> > "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
> > threshold of '202'".
> >
> >
> > 2. *Triage Calculated Values from the Profiler*
> >
> > Also, the value being interrogated here, the number of inbound flows, is
> > not a static value contained within any single telemetry message. This
> > value is calculated across multiple messages by the Profiler. The
> > current Threat Triage process cannot be used to interrogate values
> > calculated by the Profiler.
> >
> >
> > To try and keep this email concise and digestible, I am going to send a
> > follow-on discussing proposed solutions for each of these separately.
> >
>

--
Nick Allen <n...@nickallen.org>