Re: [DISCUSS] Metron Parsers in Nifi

2018-08-09 Thread Justin Leet
I'll add onto Mike's discussion with the original set of requirements I had
in mind (and apply feedback on these as necessary!). This is largely
overlap with what Mike said, but I want to make sure it's clear where my
proposal was coming from, so we can improve on it as needed.  James and
Mike are also right, I think I skipped over the benefits of NiFi in general
a bit, so thanks for chiming in there.

- Deploy our bundled parsers without needing custom wrapping on all of them.
- Don't prevent ourselves from building custom wrapping as needed.
- Custom Java parsers with an easy way to hook in, similar to what we
already do in Storm.
- One stop (or at least one format) configuration, for the case when we're
doing some things in NiFi (parsers) and some elsewhere (enrichment and
indexing). I don't think it'll always be "start in NiFi, end in Storm",
especially as we build out Stellar capability, but I also don't want users
learning a different set of configs and config tools for every platform we
run on.
- Ability to build out parsers and other systems fairly easily, e.g. Spark.
- Support our current use cases (in particular parser chaining as a more
advanced use case).

It really boils down to providing a relatively simple user path to be able
to migrate to NiFi as needed or desired as simply as possible in a very
general way, while not preventing parser by parser enhancements.

On Wed, Aug 8, 2018 at 7:14 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> I think it also provides customers greater control over their architecture
> by giving them the flexibility to choose where/how to host their parsers.
>
> To Justin's point about the API, my biggest concern about the RecordReader
> approach is that it is not stable. We already have a similar problem in
> having the TransportClient in ElasticSearch - they are prone to changing it
> in minor versions with the advent of their newer REST API, which is
> problematic for ensuring a stable installation.
>
> From my own perspective, our goal with NiFi, at least in part, should be
> the ability to deploy our core parsing infrastructure, i.e.
>
> - pre-built parsers
> - custom java parsers
> - Stellar transforms
> - custom stellar transforms
>
> And have the ability to configure it similarly to how we configure parsers
> within Storm. Consistent with our recent parser chaining and aggregation
> feature, users should be able to construct and deploy similar constructs in
> NiFi. The core architectural shift would be that parser code should be
> platform agnostic. We provide the plumbing in Storm, NiFi, and
> <... Streaming?, other>, and platform architects and devops teams can choose how
> and where to deploy.
>
> Best,
> Mike
>
>
> On Wed, Aug 8, 2018 at 9:57 AM James Sirota wrote:
>
> > Integration with NiFi would be useful for parsing low-volume telemetries
> > at the edge.  This is a much more resource friendly way to do it than
> > setting up dedicated storm topologies.  The integration would be that the
> > NiFi processor parses the data and pushes it straight into the enrichment
> > topic, saving us the resources of having multiple parsers in storm
> >
> > Thanks,
> > James
> >
> > 07.08.2018, 11:29, "Otto Fowler" :
> > > Why do we start over. We are going back and forth on implementation, and I
> > > don’t think we have the same goals or concerns.
> > >
> > > What would be the requirements or goals of metron integration with Nifi?
> > > How many levels or options for integration do we have?
> > > What are the approaches to choose from?
> > > Who are the target users?
> > >
> > > On August 7, 2018 at 12:24:56, Justin Leet (justinjl...@gmail.com) wrote:
> > >
> > > So how does the MetronRecordReader roll into everything? It seems like it'd
> > > be more useful on the reader per format approach, but otherwise it doesn't
> > > really seem like we gain much, and it requires getting everything linked up
> > > properly to be used. Assuming we looked at doing it that way, is the idea
> > > that we'd setup a ControllerService with the MetronRecordReader and a
> > > MetronRecordWriter and then have the StellarTransformRecord processor
> > > configured with those ControllerServices? How do we manage the
> > > configurations of everything that way? How does the ControllerService
> > > get configured with whatever parser(s) are needed in the flow? Basically,
> > > what's your vision for how everything would tie together?
> > >
> > > I also forgot to mention this in the original writeup, but there's another
> > > reason to avoid the RecordReader: It's not considered stable. See
> > > https://github.com/apache/nifi/blob/master/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/RecordReader.java#L34 .
> > > That alone makes me super hesitant to use it, if it can shift out from
> > > under us in even an incremental version.
> > >
> > > I'm also unclear on why StellarTransformRecord processor ma

Re: [DISCUSS] Metron Parsers in Nifi

2018-08-09 Thread Otto Fowler
I reached out to the nifi list about the Record api ‘stability’.

On August 9, 2018 at 09:54:22, Bryan Bende (bbe...@gmail.com) wrote:

I don't think there are any stability issues with the record API, it
is definitely recommended to use the record approach where it makes
sense.

That comment was probably put there on the first release and never
removed, and it has now been 4-5 releases since.

As a general comment to APIs, the record stuff is part of a controller
service API, and not part of the framework API, so I do think there is
more freedom to change the API on minor releases if needed, however I
don't see any major changes to the record stuff happening.

On Thu, Aug 9, 2018 at 5:58 AM, Mike Thomsen wrote:
> I think that comment is no longer valid. Heck PutHBaseRecord started as
> part of a project at my company in early 2017 and we found it perfectly
> stable back then.
>
> On Wed, Aug 8, 2018 at 11:46 PM Otto Fowler wrote:
>
>> I’m seeing
>>
>> https://github.com/apache/nifi/blob/master/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/RecordReader.java#L34
>>
>> being quoted as a reason to NOT build Record based processors but instead
>> stick with the original Processor api.
>>
>> Yet, on list and on hipchat and in pr’s I’ve seen the Record approach being
>> promoted heavily.
>>
>> Is this comment still correct? Is the API not considered stable?
>> Would the NiFi project recommend building externally hosted NiFi components
>> using the Record API?
>>
>> ottO
>>




Re: [DISCUSS] Metron Parsers in Nifi

2018-08-09 Thread Otto Fowler
I think the benefits are clear.  What is unclear is if the goal is to
expose or share or re-use Metron capabilities ( stellar, parsing ) in nifi
in a way that is native to nifi ( configured and managed in nifi ), where
you may not even need metron ( say you just want to parse asa ) or if the
goal is to have a hybrid approach coupling the processors/readers to the
metron installation.



Re: [DISCUSS] Metron Parsers in Nifi

2018-08-09 Thread Justin Leet
That's definitely good info, thanks for reaching out to them about it.

In terms of exposing/sharing, I don't think we have to couple them tightly
(in fact, I think we should loosen the coupling as much as possible without
forcing reimplementation of things). I think there's definitely a way to do
that in terms of the general purpose processor I proposed (or in terms of
RecordReader or another implementation).

It would definitely be easy enough to configure it to either pull from ZK
or to use a parser config json extract as a parameter (to maintain the same
formatting and make migration easy).  And we can still build specific
NiFi-oriented parsers as needed (that manage things like Schema via the
registry and other Nifi mechanisms).  This keeps parsers entirely decoupled
from a metron installation.
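
To make the two configuration paths above a bit more concrete, here is a
minimal Java sketch of resolving a parser config either from an inline JSON
property or from a shared store. Everything in it is an assumption for
illustration: the class and method names are made up, a plain Map stands in
for ZK, and the JSON contents are placeholders rather than a real Metron
parser config.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class ParserConfigSources {

    /** Inline JSON set as a processor property: the same config for any sensor type. */
    public static Function<String, Optional<String>> inline(String json) {
        return sensorType -> Optional.of(json);
    }

    /** Stand-in for a shared store such as ZK, keyed by sensor type (a plain Map here). */
    public static Function<String, Optional<String>> fromStore(Map<String, String> store) {
        return sensorType -> Optional.ofNullable(store.get(sensorType));
    }

    /** Try each source in order and return the first config found. */
    @SafeVarargs
    public static Optional<String> resolve(String sensorType,
                                           Function<String, Optional<String>>... sources) {
        for (Function<String, Optional<String>> source : sources) {
            Optional<String> config = source.apply(sensorType);
            if (config.isPresent()) {
                return config;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        // Placeholder JSON; the key and class name are illustrative only.
        store.put("asa", "{\"parserClassName\": \"com.example.HypotheticalAsaParser\"}");

        // No inline property configured, so the shared store is consulted.
        System.out.println(resolve("asa", fromStore(store)));

        // An inline property listed first takes precedence over the shared store.
        System.out.println(resolve("asa",
                inline("{\"parserClassName\": \"com.example.Override\"}"),
                fromStore(store)));
    }
}
```

The point of the ordering in resolve is only that both paths hand back the
same JSON format, so a config extract could move between them unchanged.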

Alternatively, we extract our config handling into a module and scripts that
we can package up to easily deploy configs against ZK (or maybe NiFi's
StateControllers or whatever).  We definitely shouldn't need absolutely
everything installed to be able to run just parsers on Nifi.

Having said that, right now the easiest way we have to maintain on the fly
updatable configs (and updatable is important!) is via ZK.  Params in Nifi
aren't quite that flexible, to the best of my knowledge (i.e. you have to
stop, update config and restart). We might be able to exploit the
StateController to manage this for us, but I'm honestly not familiar enough
with it. For deployments split between NiFi and Storm, it also means
configuration gets managed in a couple of different ways (which may be fine
with users, since there is a fairly bright-line delineation that makes it
easier to accept).  There are some complicated configs like fieldTransforms,
which is part of why I would like things to be configured in the same format
(if not the same mechanism).

Ideally, in my mind, the parsers shared between both NiFi and Storm just
implement the very general MessageParser interface (which is pretty
minimal, a couple setup methods, validation, and the actual parse).  This
is pretty lightweight, and the split of metron-parsers into
metron-parsers-common et al. would loosen the coupling between parsers and
the rest of metron, down to just the core needed to support that.
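
As a rough illustration of the shape described above (a couple of setup
methods, validation, and the actual parse), here is a self-contained Java
sketch. The SketchParser interface is a hypothetical stand-in written for
this example and does not reproduce Metron's actual MessageParser
signatures; the toy key=value parser and the run method are likewise
assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class PlatformAgnosticParserSketch {

    /** Hypothetical stand-in for a minimal, platform-agnostic parser interface. */
    public interface SketchParser {
        void configure(Map<String, Object> config);    // setup: receive parser config
        void init();                                    // setup: prepare resources
        List<Map<String, Object>> parse(byte[] raw);    // raw bytes -> parsed messages
        boolean validate(Map<String, Object> message);  // reject incomplete messages
    }

    /** Toy parser for lines such as "src=10.0.0.1 dst=10.0.0.2". */
    public static class KeyValueParser implements SketchParser {
        private String delimiter = " ";

        @Override
        public void configure(Map<String, Object> config) {
            Object d = config.get("delimiter");
            if (d != null) {
                delimiter = d.toString();
            }
        }

        @Override
        public void init() {
            // Nothing to prepare for this toy parser.
        }

        @Override
        public List<Map<String, Object>> parse(byte[] raw) {
            Map<String, Object> message = new LinkedHashMap<>();
            String line = new String(raw, StandardCharsets.UTF_8).trim();
            for (String token : line.split(Pattern.quote(delimiter))) {
                int eq = token.indexOf('=');
                if (eq > 0) {
                    message.put(token.substring(0, eq), token.substring(eq + 1));
                }
            }
            return Collections.singletonList(message);
        }

        @Override
        public boolean validate(Map<String, Object> message) {
            return message.containsKey("src") && message.containsKey("dst");
        }
    }

    /** Host-side plumbing: a Storm bolt or NiFi processor would wrap this same call sequence. */
    public static List<Map<String, Object>> run(SketchParser parser,
                                                Map<String, Object> config,
                                                List<byte[]> inputs) {
        parser.configure(config);
        parser.init();
        List<Map<String, Object>> valid = new ArrayList<>();
        for (byte[] raw : inputs) {
            for (Map<String, Object> message : parser.parse(raw)) {
                if (parser.validate(message)) {
                    valid.add(message);
                }
            }
        }
        return valid;
    }
}
```

Only run knows anything about the host; the parser itself depends on nothing
beyond the interface, which is the decoupling being argued for here.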

IMO, at that point, we'd have a pretty minimal NAR (or NARs depending on
config management) that lets us run our set of parsers, lets users build
new parsers (and doesn't block specialized NiFi implementations that exploit
NiFi's feature set), and lets us get things configured in a relatively
consistent manner, without losing features, and hopefully requiring a
pretty minimal slice of Metron to be useful.


Good press for Metron!

2018-08-09 Thread Casey Stella
https://www.darkreading.com/endpoint/oh-no-not-another-security-product/a/d-id/1332453


Re: Good press for Metron!

2018-08-09 Thread Vets, Laurens
I was just reading this, see the IRC channel :)

On 09-Aug-18 08:21, Casey Stella wrote:
> https://www.darkreading.com/endpoint/oh-no-not-another-security-product/a/d-id/1332453



Re: [DISCUSS] Metron Parsers in Nifi

2018-08-09 Thread Otto Fowler
I would say that

- For each configuration parameter we want to pull in, it should be
explicitly configured through a property as well as through a controller
service that accesses the metron zk
- Transformations should not be conflated with parsing in those processors
or readers

There is no on the fly configuration change in nifi ( You can’t change
properties once started ).

Wouldn’t the simplest minimal start be to say that we expect either nifi or
metron and simplify things?  Let nifi nifi, let metron metron.

