Re: [Discuss] Improve Alerting

Nick Allen Wed, 01 Feb 2017 11:35:39 -0800

Great.  I think we're thinking along the same lines.  I just sent a
follow-up with another proposal that takes this idea a little further.  What
if we treated the Profiler as another source of telemetry?
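
To make that concrete, here is a rough sketch of the kind of message a
profile might emit on each tick under Casey's proposal quoted below.  The
function name and every field name except the 'system_alert' source type
and the is_alert flag are assumptions for illustration, not existing
Metron fields or APIs.

```python
import json


def build_tick_message(host, flow_count, period_minutes=15):
    """Build a hypothetical telemetry message emitted by a profile on the tick.

    Only the 'system_alert' source type and the is_alert flag come from the
    proposal; the remaining field names are illustrative assumptions.
    """
    return {
        "source.type": "system_alert",     # source type named in the proposal
        "is_alert": True,                  # flag named in the proposal
        "host": host,                      # assumed field name
        "inbound_flows": flow_count,       # assumed field name
        "period_minutes": period_minutes,  # assumed field name
    }


message = build_tick_message("powned.svr.bank.com", 230)
# Serialized form that would be written to the enrichment queue.
payload = json.dumps(message)
```

Downstream, enrichment and triage would treat this message like any other
telemetry.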


On Wed, Feb 1, 2017 at 2:23 PM, Casey Stella <ceste...@gmail.com> wrote:

> Regarding point 2, could we enable the profiler to write data to kafka and
> the enrichment queue?
>
> I'm proposing the profiler do something like this:
>
>    - Count the number of inbound flows
>    - On the tick, send a message to the enrichment queue containing:
>       - the number of flows
>       - A source type of 'system_alert'
>       - is_alert set to true
>    - In enrichment, we enrich and triage system_alert source data in the
>    same way we do any other.
>
> This would not solve the transparency issue, but it would at least keep
> triage in one place in the architecture.  Also, enabling Kafka writing
> would enable other types of use-cases, like situations where we find
> outliers *directly* in the profile and send the alerts directly to the
> indexing queue without triage.
>
> The only changes this proposal would require would be
>
>    1. adding a "write" section to a profile that takes a list of Stellar
>    statements and runs them when the profile is written on the tick
>    2. fixing the Kafka-writing Stellar functions
>
> Casey
>
> On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > I'd like to explore the functionality that we have in Metron using a
> > motivating example.  I think this will help highlight some gaps where we
> > can enhance Metron.
> >
> > The motivating example is that I would like to create an alert if the
> > number of inbound flows to any host over a 15 minute interval is abnormal.
> > I would like the alert to contain the specific information below to
> > streamline the triage process.
> >
> > Rule: Abnormal number of inbound flows
> > Bin: 15 mins
> > Alert: The host 'powned.svr.bank.com' has '230' inbound flows, exceeding
> > the threshold of '202'
> >
> >
> > *What Works*
> >
> > In some ways, this example is similar to the "Outlier Detection" demo that
> > I performed with the Profiler a few months back.  We have most of what we
> > need to do this with a couple caveats.
> >
> > 1. An enrichment would be added to enrich the message with the correct
> > internal hostname 'powned.svr.bank.com'.
> > 2. With the Profiler, I can capture some idea of what "normal" is for the
> > number of inbound flows across 15 minute intervals.
> > 3. With Threat Triage, I can create rules that alert when a value exceeds
> > what the Profiler defines as normal.
> >
> >
> > *What's Missing*
> >
> > It's nice to know that we are almost all the way there with this example.
> > Unfortunately, there are two gaps that fall out of this.
> >
> > 1. *Threat Triage Transparency*
> >
> > There is little transparency into the Threat Triage process itself.  When
> > Threat Triage runs, all I get is a score.  I don't know how that score was
> > arrived at, which rules were triggered, and the specific values that caused
> > a rule to trigger.
> >
> > More specifically, there is no way to generate a message that looks like
> > "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
> > threshold of '202'".
> >
> >
> > 2. *Triage Calculated Values from the Profiler*
> >
> > Also, the value being interrogated here, the number of inbound flows, is
> > not a static value contained within any single telemetry message.  This
> > value is calculated across multiple messages by the Profiler.  The current
> > Threat Triage process cannot be used to interrogate values calculated by
> > the Profiler.
> > the Profiler.
> >
> >
> > To try and keep this email concise and digestible, I am going to send a
> > follow-on discussing proposed solutions for each of these separately.
> >
>
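
For reference, the alert text from the motivating example quoted above
could be produced by something like the sketch below.  It assumes the
threshold has already been derived from the Profiler; the function name
and signature are hypothetical and are not part of Threat Triage today.

```python
def describe_alert(host, observed, threshold):
    """Return the alert text when the observed flow count exceeds the threshold.

    Hypothetical helper: how the threshold is computed is out of scope here.
    Returns None when no alert should be raised.
    """
    if observed <= threshold:
        return None
    return (
        "The host '%s' has '%d' inbound flows, exceeding the threshold of '%d'"
        % (host, observed, threshold)
    )


alert_text = describe_alert("powned.svr.bank.com", 230, 202)
```

The point is only that the rule needs access to the host, the observed
value, and the threshold at the moment it fires.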



-- 
Nick Allen <n...@nickallen.org>
