It should also be smart enough to handle an order like the one below
without duplicate hits of the topologies:

source("bro")
  -> parser("BasicBroParser")
  -> exists("ip_src_addr")
  -> geo_ip_src = geo["ip_src_addr"]
  -> application = assets["ip_src_addr"].application
  -> owner = assets["ip_src_addr"].owner
  -> exists("ip_dst_addr")
  -> geo_ip_dst = geo["ip_dst_addr"]
  -> elasticsearch("bro-index")

Jon

On Thu, Oct 6, 2016 at 1:55 PM Nick Allen <n...@nickallen.org> wrote:

> Here is a quick example with some hypothetical syntax. Whatever that syntax
> might be, it would be very simple, easy to understand, and leverage
> high-level concepts specific to Metron.
>
> This flow consumes Bro data, ensures there are valid source/destination
> IPs, performs geo-enrichment, asset enrichment and finally persists the
> data in Elasticsearch.
>
> source("bro")
>   -> parser("BasicBroParser")
>   -> exists("ip_src_addr")
>   -> exists("ip_dst_addr")
>   -> geo_ip_src = geo["ip_src_addr"]
>   -> geo_ip_dst = geo["ip_dst_addr"]
>   -> application = assets["ip_src_addr"].application
>   -> owner = assets["ip_src_addr"].owner
>   -> elasticsearch("bro-index")
>
> On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen <n...@nickallen.org> wrote:
>
> > Chasing this bad idea down even further leads me to something even
> > crazier.
> >
> > Stellar 1.0 can only operate within a single topology and in most cases
> > only on a single message. Stellar 2.0 could be the mechanism that allows
> > users to define their own data flows and what "useful bits of Metron
> > functionality" get plugged in.
> >
> > Once you have a DSL that allows users to define what they want Metron to
> > do, then the underlying implementation mechanism (which is currently Storm)
> > can also be swapped out. If we have an even faster Storm implementation,
> > then we swap in the Storm NG engine. Maybe we want Metron to also run in
> > Flink, then we just swap in a Flink engine.
> >
> > On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen <n...@nickallen.org> wrote:
> >
> >> I totally "bird dogged the previous thread" as Casey likes to call it. :)
> >> I am extracting this thought into a separate thread before I start
> >> throwing out even more, crazier ideas.
> >>
> >> In general, Metron is very opinionated about data flows right now.
> >>> We have Parser topologies that feed an Enrichment topology, which then feeds
> >>> an Indexing topology. We have useful bits of functionality (think Stellar
> >>> transforms, Geo enrichment, etc.) that are closely coupled with these
> >>> topologies (aka data flows).
> >>>
> >>> When a user wants to parse heterogeneous data from a single topic, that's
> >>> not easy. When a user wants enriched output to land in unique topics by
> >>> sensor type, well, that's also not easy. When a user wanted to skip
> >>> enrichment of data sources, we actually re-architected the data flow to add
> >>> the Indexing topology.
> >>>
> >>> In an ideal world, a user should be responsible for defining the data
> >>> flow, not Metron. Metron should provide the "useful bits of functionality"
> >>> that a user can "plug in" wherever they like. Metron itself should not care
> >>> how the data is moving or what step in the process it is at.
> >>
> >> --
> >> Nick Allen <n...@nickallen.org>
> >
> > --
> > Nick Allen <n...@nickallen.org>
>
> --
> Nick Allen <n...@nickallen.org>

--
Jon
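
For what it's worth, here is a rough sketch of how a planner might handle the interleaved ordering at the top of this message without visiting a topology twice. Everything in it is an assumption made for illustration: the syntax is the hypothetical one from this thread, the STAGE_OF mapping of steps to topologies is invented, and parse_flow and plan are made-up helper names, not anything in Metron or Stellar. The idea is just to tag each step with the stage that would run it and collapse consecutive steps of the same stage into a single hop.

```python
import re
from itertools import groupby

# Hypothetical mapping from a step's operation to the topology that would
# run it. These assignments are assumptions for illustration only.
STAGE_OF = {
    "source": "parsing",
    "parser": "parsing",
    "exists": "enrichment",
    "geo": "enrichment",
    "assets": "enrichment",
    "elasticsearch": "indexing",
}

# The interleaved ordering from the top of this message.
FLOW = """
source("bro")
  -> parser("BasicBroParser")
  -> exists("ip_src_addr")
  -> geo_ip_src = geo["ip_src_addr"]
  -> application = assets["ip_src_addr"].application
  -> owner = assets["ip_src_addr"].owner
  -> exists("ip_dst_addr")
  -> geo_ip_dst = geo["ip_dst_addr"]
  -> elasticsearch("bro-index")
"""


def parse_flow(text):
    """Split a flow on '->' and return (step_text, operation_name) pairs."""
    steps = []
    for raw in text.split("->"):
        step = raw.strip()
        if not step:
            continue
        # For an assignment such as `x = geo[...]`, the operation is on
        # the right-hand side of the '='.
        expr = step.split("=", 1)[1].strip() if "=" in step else step
        match = re.match(r"[A-Za-z_]\w*", expr)
        if match is None:
            raise ValueError(f"cannot find an operation name in: {step!r}")
        steps.append((step, match.group(0)))
    return steps


def plan(text):
    """Collapse consecutive steps that share a stage into one hop."""
    staged = [(STAGE_OF[name], step) for step, name in parse_flow(text)]
    return [
        (stage, [step for _, step in group])
        for stage, group in groupby(staged, key=lambda pair: pair[0])
    ]


if __name__ == "__main__":
    for stage, steps in plan(FLOW):
        print(stage, len(steps))
```

Under these assumptions the nine steps collapse into three hops (parsing, enrichment, indexing), because the exists checks and the geo/asset lookups all map to the same stage. A real planner would also need to decide when steps from different stages can safely be reordered, which this sketch does not attempt.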