Re: PULL ProvenanceEvent

2019-11-06 Thread Adam Taft
+1 Joe - this is a good compromise to keep the original API undisturbed. On Wed, Nov 6, 2019 at 11:05 AM Joe Witt wrote: > Nissim > > Notionally I am saying that session.getProvenanceReporter().receive(...) > should have an option to call > session.getProvenanceReporter().receive(...,ACTIVE|PAS

Re: PULL ProvenanceEvent

2019-11-06 Thread Nissim Shiman
Joe, Very nice... Thanks! Nissim On Wednesday, November 6, 2019, 1:05:09 PM EST, Joe Witt wrote: Nissim Notionally I am saying that session.getProvenanceReporter().receive(...) should have an option to call session.getProvenanceReporter().receive(...,ACTIVE|PASSIVE) and if not speci

Re: PULL ProvenanceEvent

2019-11-06 Thread Joe Witt
Nissim Notionally I am saying that session.getProvenanceReporter().receive(...) should have an option to call session.getProvenanceReporter().receive(...,ACTIVE|PASSIVE) and if not specified it would be UNSPECIFIED. I don't think this needs to be on the flowfile attribute - it would go straight to
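
The overload idea in this message can be sketched as below. This is only an illustration and not NiFi's API: `ReceiveModeSketch`, `ReceiveMode`, `ReceiveEvent`, and `SketchReporter` are hypothetical stand-ins, flowfiles are reduced to string ids, and the URIs are placeholders. It shows the existing two-argument call keeping its shape and defaulting to UNSPECIFIED, while a new overload accepts ACTIVE or PASSIVE.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the real NiFi API.
class ReceiveModeSketch {

    // Hypothetical qualifier for a RECEIVE event, as floated in the thread.
    enum ReceiveMode { ACTIVE, PASSIVE, UNSPECIFIED }

    // Minimal stand-in for a recorded provenance event.
    static final class ReceiveEvent {
        final String flowFileId;
        final String transitUri;
        final ReceiveMode mode;

        ReceiveEvent(String flowFileId, String transitUri, ReceiveMode mode) {
            this.flowFileId = flowFileId;
            this.transitUri = transitUri;
            this.mode = mode;
        }
    }

    // Stand-in for a provenance reporter.
    static final class SketchReporter {
        final List<ReceiveEvent> events = new ArrayList<>();

        // Existing call shape stays undisturbed; the mode defaults to UNSPECIFIED.
        void receive(String flowFileId, String transitUri) {
            receive(flowFileId, transitUri, ReceiveMode.UNSPECIFIED);
        }

        // New overload: the caller states whether the data was pulled or pushed.
        void receive(String flowFileId, String transitUri, ReceiveMode mode) {
            events.add(new ReceiveEvent(flowFileId, transitUri, mode));
        }
    }

    public static void main(String[] args) {
        SketchReporter reporter = new SketchReporter();
        reporter.receive("ff-1", "https://example.invalid/listen");
        reporter.receive("ff-2", "https://example.invalid/get", ReceiveMode.ACTIVE);
        for (ReceiveEvent e : reporter.events) {
            System.out.println(e.flowFileId + " " + e.mode);
        }
    }
}
```

Because the mode is recorded on the event rather than on the flowfile, it does not travel downstream as an attribute would.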

Re: PULL ProvenanceEvent

2019-11-06 Thread Nissim Shiman
Joe, Just to verify what you mean, You are saying that the line: flowfile = session.putAttribute(flowfile, "receiveType", "active") could be added before session.getProvenanceReporter().receive(...) to indicate a PULL.  Is this correct? Thanks, Nissim On Monday, November 4, 2019,

Re: PULL ProvenanceEvent

2019-11-04 Thread Nissim Shiman
Having an attribute added indicating passive/active/query for RECEIVE and FETCH will work, but nifi attributes are stateful (i.e. they will still be on the flowfile as metadata a couple of processor steps down the flow). Maybe an option is to expand the api for RECEIVE and FETCH for with a

Re: PULL ProvenanceEvent

2019-10-31 Thread Joe Witt
These distinctions may be meaningful. Adding them as an attribute lets the meaning convey but not introduce complexity for the majority case which is the distinction isn't key. thanks On Thu, Oct 31, 2019 at 4:05 PM Nissim Shiman wrote: > Mike, > I like the QUERY type as well. Basically a mor

Re: PULL ProvenanceEvent

2019-10-31 Thread Nissim Shiman
Mike, I like the QUERY type as well. Basically a more refined PULL. Very nice. Part of the challenge of adding PULL as a type is that there are currently two flavors of RECEIVEs: RECEIVE and FETCH [1]. So any addition of a PULL would need a second flavor of PULL to match the case where a f

Re: PULL ProvenanceEvent

2019-10-30 Thread Mike Thomsen
I like the idea of creating PULL as a type. In fact, I'd propose that there are three scenarios here:
RECEIVE - Passively acquire in a sort of hand-off situation. Ex: Kafka subscription
PULL - Direct operations to seek out and fetch something in a targeted fashion. Ex. GetHttp
QUERY - Go looking f

Re: PULL ProvenanceEvent

2019-10-30 Thread Nissim Shiman
Joe, It is hard to say how much value transit URI would bring to clarify a RECEIVE. For example a RECEIVE with transit URI of https: could be either a GetHTTP (i.e. active) or ListenHTTP (i.e. passive) but your idea of "a metadata item specifying active vs passive" is a very clever way to ma

Re: PULL ProvenanceEvent

2019-10-29 Thread Joe Witt
Nissim I like the idea to introduce a more refined type of event for how data is brought into nifi (active - PULL, passive - RECEIVE). That said it might be sufficient to simply have this distinction be on the "RECEIVE" event as a metadata item specifying active vs passive. The protocol utilized

Re: PULL ProvenanceEvent

2019-10-29 Thread Nissim Shiman
Adam, good points... I missed a step in explaining the use case where Provenance Events are incomplete... Where the second nifi does a GetSFTP from the *filesystem* that the first nifi is located on. So the second nifi currently sends a RECEIVE event, but there is no corresponding SEND event from

Re: PULL ProvenanceEvent

2019-10-28 Thread Adam Taft
> But a flowfile that was PULLed by the second nifi (from the first nifi) will not necessarily have any provenance event generated by the first nifi. Isn't this the fault of the first NiFi to fail to emit a SEND event in response to the second NiFi's request? In this scenario, shouldn't the send/

Re: PULL ProvenanceEvent

2019-10-28 Thread Nissim Shiman
Adam, I believe there is a need for more detailed ProvenanceEvents. A use case would be a customer that is trying to track data passed between two nifi instances and trying to match up SENDs and RECEIVEs. So a flowfile that has a SEND event on the first nifi should have a RECEIVE event on the second nif

Re: PULL ProvenanceEvent

2019-10-11 Thread Nissim Shiman
Adam, "Yes" to your first question and the four processor examples you listed. I will need to get back to you regarding your other points. Thanks, Nissim On Thursday, October 10, 2019, 7:05:57 PM EDT, Adam Taft wrote: Nissim, Just to be clear, you are trying to distinguish between p

Re: PULL ProvenanceEvent

2019-10-10 Thread Adam Taft
Nissim, Just to be clear, you are trying to distinguish between processors which are actively "pulling" data (GetXYZ) vs. processors which just "listen" for data (ListenXYZ)? Is that your basic vision?
GetFile => PULL
GetHTTP => PULL
ListenHTTP => RECEIVE
ListenTCP => RECEIVE
Could you clarify

PULL ProvenanceEvent

2019-10-10 Thread Nissim Shiman
Hello Team, The ProvenanceEventType class does a good job capturing possible events, but the PULL event doesn't seem to fall nicely into any of the existing types. https://gitbox.apache.org/repos/asf?p=nifi.git;a=blob;f=nifi-api/src/main/java/org/apache/nifi/provenance/ProvenanceEventType.java RE