Let's make sure we have a common understanding of the use case (there are
likely many).  You mentioned replaying historical data, which is very cool,
but it can mean a lot of different things (we all have very active
imaginations).

Here are a few broad strokes of what I have been thinking about that
relate to "replay".  Hopefully, this relates to your thoughts in this
thread and I am not taking us on a tangent.

(1) As a Security Data Scientist, I'd like to be able to replay historical
pcap through my signature-based IDS suite.  Between the time when the pcap
was captured and now (2, 8, or 12 weeks), my signatures have been updated
based on newly discovered threats in the wild.  If I find newly generated
alerts during replay that were not generated initially, then my systems
were likely breached by (or at least exposed to) an advanced actor with
access to a zero-day vulnerability.  Since exploitation can often take
months, I still have time to react and mitigate the breach.

(2) As a Security Data Scientist, I don't want to wait for a profile to be
generated from data in real time.  It is difficult to understand whether
the profile I have created is (a) correct or (b) has any value to me unless
I can see data from it over a span of time.  If I have to wait for the
profile to be generated in real time, this slows down my progress in
performing exploratory analysis and model building.


(3) As an Investigator, I need to create a profile to investigate ongoing
suspicious activity.  I often investigate incidents that began in the past
and may or may not currently be active.  I often don't know what I need to
profile until responding to an active incident.  If I could generate a
profile from a starting point in the past, I might be able to understand
how a security incident began, how it has spread, and what assets have been
exposed.

(4) As a Platform Engineer, I was given a model to deploy in production.
The model needs data from a profile generated by the Profiler.  I'd like
instant feedback to know whether I deployed things correctly.  If I could
generate a profile from some point in the past, I could validate that the
model and profile work on production data sooner.  The model would also
start functioning sooner.
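
To make (2) through (4) a bit more concrete, here is a rough sketch of what
"generate a profile from some point in the past" could look like when the
profile is bucketed by event time rather than system time.  This is only an
illustration, not how the Profiler is implemented: the function name, the
simple per-entity count, and the use of "ip_src_addr" and "timestamp"
(epoch millis) as field names are all my assumptions.

```python
from collections import defaultdict


def backfill_profile(events, period_ms, start_ms):
    """Count events per (entity, period), bucketed by event time.

    Hypothetical helper for illustration only. `events` is an iterable of
    dicts carrying an assumed "ip_src_addr" entity field and an assumed
    "timestamp" field holding the event time in epoch milliseconds.
    """
    profile = defaultdict(int)
    for event in events:
        event_time = event["timestamp"]
        if event_time < start_ms:
            # Older than the requested starting point; skip it.
            continue
        # The period comes from the event's own timestamp, so replaying
        # old data lands in the periods where it originally occurred.
        period = event_time // period_ms
        profile[(event["ip_src_addr"], period)] += 1
    return dict(profile)
```

Because the period is derived from the event's own timestamp, running this
over archived telemetry yields the same buckets no matter when the replay
happens, which is what the use cases above depend on.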

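For (1), the comparison between the original run and the replay might be
sketched as below.  Again, this is just an assumption-laden illustration:
the alert fields ("src", "dst", "event_time", "signature_id") and the
function name are made up for the example, and a real IDS would need a more
careful notion of "same alert".

```python
def new_alerts_on_replay(original_alerts, replay_alerts):
    """Return replay alerts that have no counterpart in the original run.

    Hypothetical helper for illustration only. Alerts are dicts with
    assumed "src", "dst", "event_time", and "signature_id" fields.
    """
    def key(alert):
        # Match on the packet's event time, not on when the IDS ran, so
        # an alert from the replay lines up with its original counterpart.
        return (
            alert["src"],
            alert["dst"],
            alert["event_time"],
            alert["signature_id"],
        )

    seen = {key(alert) for alert in original_alerts}
    return [alert for alert in replay_alerts if key(alert) not in seen]
```

Anything this returns is an alert that only fires with the updated
signatures, which is the signal I would want to investigate.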

On Tue, Feb 28, 2017 at 8:03 AM, Justin Leet <justinjl...@gmail.com> wrote:

> There's a couple JIRAs related to the use of system time vs event time.
>
> METRON-590 Enable Use of Event Time in Profiler
> <https://issues.apache.org/jira/browse/METRON-590>
> METRON-691 Elastic Writer index partitions on system time, not event time
> <https://issues.apache.org/jira/browse/METRON-691>
>
> Is there anything else that needs to be making this distinction, and if so,
> do we need to be able to support both system time and event time for it?
>
> My immediate thought on this is that, once we work on replaying historical
> data, we'll want system time for geo data passing through.  Given that the
> geo files can update, we'd want to know which geo file we actually need to
> be using at the appropriate time.
>
> We'll probably also want to double check anything else that writes out data
> to a location and provides some sort of timestamping on it.
>
> Justin
>
