Let's make sure we have a common understanding of the use cases (there are likely many). You mentioned replaying historical data, which is very cool, but it can mean a lot of different things (we all have very active imaginations).
Here are a few broad strokes of what I have been thinking about in relation to "replay". Hopefully this relates to your thoughts in this thread and I am not taking us on a tangent.

(1) As a Security Data Scientist, I'd like to be able to replay historical pcap through my signature-based IDS suite. Between the time when the pcap was captured and now (2, 8, or 12 weeks), my signatures have been updated based on newly discovered threats in the wild. If I find newly generated alerts during replay that were not generated initially, then my systems were likely breached by (or at least exposed to) an advanced actor with access to a zero-day vulnerability. Since exploitation can often take months, I still have time to react and mitigate the breach.

(2) As a Security Data Scientist, I don't want to wait for a profile to be generated from data in real time. It is difficult to understand whether the profile I have created is (a) correct or (b) of any value to me unless I can see data from it over a span of time. If I have to wait for the profile to be generated in real time, this slows down my progress in performing exploratory analysis and model building.

(3) As an Investigator, I need to create a profile to investigate ongoing suspicious activity. I often investigate incidents that began in the past and may or may not currently be active. I often don't know what I need to profile until I am responding to an active incident. If I could generate a profile from a starting point in the past, I might be able to understand how a security incident began, how it has spread, and what assets have been exposed.

(4) As a Platform Engineer, I was given a model to deploy in production. The model needs data from a profile generated by the Profiler. I'd like instant feedback to know whether I deployed things correctly. If I could generate a profile from some point in the past, I could validate that the model and profile work on production data sooner.
The model would also start functioning sooner.

On Tue, Feb 28, 2017 at 8:03 AM, Justin Leet <justinjl...@gmail.com> wrote:

> There's a couple JIRAs related to the use of system time vs event time.
>
> METRON-590 Enable Use of Event Time in Profiler
> <https://issues.apache.org/jira/browse/METRON-590>
> METRON-691 Elastic Writer index partitions on system time, not event time
> <https://issues.apache.org/jira/browse/METRON-691>
>
> Is there anything else that needs to be making this distinction, and if so,
> do we need to be able to support both system time and event time for it?
>
> My immediate thought on this is that, once we work on replaying historical
> data, we'll want system time for geo data passing through. Given that the
> geo files can update, we'd want to know which geo file we actually need to
> be using at the appropriate time.
>
> We'll probably also want to double check anything else that writes out data
> to a location and provides some sort of timestamping on it.
>
> Justin
>
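
To make the system time vs event time distinction concrete for the replay case, here is a rough Python sketch. It is not Metron code: the field names `event_time` and `system_time`, the 15-minute period, and both helper functions are assumptions made up for illustration. It buckets three messages that were captured over 30 minutes but replayed within the same second.

```python
from collections import Counter

# Hypothetical 15-minute profile period, in milliseconds.
PERIOD_MS = 15 * 60 * 1000


def period_start(timestamp_ms, period_ms=PERIOD_MS):
    """Return the start of the period that contains timestamp_ms."""
    return timestamp_ms - (timestamp_ms % period_ms)


def count_per_period(messages, time_field):
    """Count messages per period, bucketing on the given timestamp field."""
    return Counter(period_start(m[time_field]) for m in messages)


# Three messages captured 15 minutes apart (event_time), then replayed
# back-to-back much later (system_time). Field names are hypothetical.
messages = [
    {"event_time": 0,         "system_time": 10_000_000_000},
    {"event_time": 900_000,   "system_time": 10_000_000_100},
    {"event_time": 1_800_000, "system_time": 10_000_000_200},
]

by_event = count_per_period(messages, "event_time")
by_system = count_per_period(messages, "system_time")

print("periods when bucketing on event time: ", len(by_event))
print("periods when bucketing on system time:", len(by_system))
```

Bucketing on event time spreads the messages across three periods, which is what a profile built from historical data would need. Bucketing on system time collapses them into the single period in which the replay happened to run.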