Re: Wait from multiple inputs before ending the flow

2015-12-12 Thread Juan Sequeiros
Good afternoon,

All processors have a Scheduling tab where you can set a run schedule (run
every X amount of time). I would set it on your last InvokeHttp.
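
For illustration only, here is a rough Python sketch of what the retry loop
plus a fixed delay amounts to. It is not NiFi code: `invoke_with_retries`,
`max_retries` and `delay_seconds` are made-up names, and the delay simply
stands in for the run schedule on the last InvokeHttp.

```python
import time


def invoke_with_retries(invoke, max_retries, delay_seconds, sleep=time.sleep):
    """Call `invoke` until it succeeds or the retry counter reaches `max_retries`.

    `invoke` is any zero-argument callable returning True on success.
    Returns a tuple (succeeded, retry_count).
    """
    retry_count = 0
    while True:
        if invoke():
            return True, retry_count
        # Failure path: increment the counter (the UpdateAttribute step).
        retry_count += 1
        # Route on the counter (the "if lower than X, retry" step).
        if retry_count >= max_retries:
            return False, retry_count
        # Wait before the next attempt so the loop does not spin too fast.
        sleep(delay_seconds)
```

Passing `sleep` as a parameter is only there so the wait can be swapped out
when trying the sketch without real delays.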

If timing is important, look at the prioritizer settings too, so that you
can, for example, send a more important FlowFile through first.
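
To make the prioritizer idea concrete, here is a small Python sketch of a
queue that hands out the most important item first. It only illustrates the
concept and is not how NiFi implements prioritizers; the class name
`PriorityFlowQueue` and the rule "lower number goes first" are assumptions
made for this example.

```python
import heapq
import itertools


class PriorityFlowQueue:
    """Queue that releases items by priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps arrival order for items of equal priority.
        self._arrival = itertools.count()

    def put(self, name, priority):
        # Assumption for this sketch: a lower number means more important.
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def get(self):
        # Pop the entry with the smallest (priority, arrival) pair.
        _priority, _arrival, name = heapq.heappop(self._heap)
        return name

    def __len__(self):
        return len(self._heap)
```

With this, an item queued with priority 1 comes out before items queued
earlier with priority 5, while items of equal priority keep their order.
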
On Dec 12, 2015 13:48, "Louis-Étienne Dorval"  wrote:

> Hi again,
>
>
> MergeContent works perfectly for my case! The flow I described in the
> previous email changed a bit, but it's still working as expected.
>
> The only problem is that NiFi is now much faster than the existing system
> running in parallel (which is really good). I've built the "retry loop"
> described below, but it's still too fast:
>
> InvokeHttp -- (failure) --> UpdateAttribute (increment a counter) -->
> RouteOnAttribute (if lower than X, retry) --> InvokeHttp
>
>
> Question: Is there something that already exists which could "sleep" a
> FlowFile for X seconds before continuing?
>
>
> Best regards and great job on the version 0.4.0, the Syslog feature is
> much appreciated!
> Louis-Etienne
>
> PS: Let me know if I should have started a new email thread with that
> question.

Re: Wait from multiple inputs before ending the flow

2015-12-12 Thread Louis-Étienne Dorval
Hi again,


MergeContent works perfectly for my case! The flow I described in the
previous email changed a bit, but it's still working as expected.

The only problem is that NiFi is now much faster than the existing system
running in parallel (which is really good). I've built the "retry loop"
described below, but it's still too fast:

InvokeHttp -- (failure) --> UpdateAttribute (increment a counter) -->
RouteOnAttribute (if lower than X, retry) --> InvokeHttp


Question: Is there something that already exists which could "sleep" a
FlowFile for X seconds before continuing?


Best regards and great job on the version 0.4.0, the Syslog feature is much
appreciated!
Louis-Etienne

PS: Let me know if I should have started a new email thread with that
question.


On 6 December 2015 at 23:30, Joe Witt  wrote:

> Louis-Etienne,
>
> NIFI-190 isn't scheduled on anything as of yet.  We had some design
> questions/ideas and your example informs it even further.
>
> I think the custom proc method you mention will work out well.
> Ultimately there will need to be one anyway to deal with the logic of
> merging this particular format+schema.
>
> Thanks
> Joe
>
> On Sun, Dec 6, 2015 at 11:28 PM,   wrote:
> > Joe,
> >
> > Thanks for the prompt reply.
> > About the merge, both messages will be JSON and I need some specific
> > parts from both.
> > I'll recheck the docs to see what my options are, but I think that using
> > FlowFile Streams and a custom processor that would do the logic might be
> > good.
> >
> > About the HoldProcessor, you must be talking about NIFI-190. The way you
> > describe it seems to be what I'm looking for.
> > But from the JIRA and a quick look at the PR, it seems like I would lose
> > the message from Topic2.
> >
> > I'll dig into the code of the PR and the MergeContent processor in order
> > to have a better understanding.
> >
> > Was that JIRA scheduled for a specific milestone? It would probably be a
> > great addition, but maybe it requires a lot of changes that I don't see yet.
> >
> > Regards,
> > Louis-Etienne
> >
> >> On Dec 6, 2015, at 9:42 PM, Joe Witt  wrote:
> >>
> >> Louis-Etienne,
> >>
> >> My initial thought is your idea with MergeContent is the right one.
> >> However, the issue there is not just the combining of the data but the
> >> 'what does merging truly mean in that case'.  So it is a bit undefined
> >> what the next step will be.  Merge the content?  If so, how?  What is
> >> the format and schema of the objects before the merge and after?
> >>
> >> Another member of the community had an idea for a concept of a
> >> HoldProcessor.  It would allow these sorts of multi-object gates to
> >> occur.  The same issue exists of what to do once the gate criteria is
> >> hit but at that point you'd have more control over it.  MergeContent
> >> is an already prescribed set of behaviors whereas HoldContent would
> >> let you choose the next gate.  We really should get on with helping
> >> get that contribution in.
> >>
> >> Thanks
> >> Joe
> >>
> >>> On Sun, Dec 6, 2015 at 9:35 PM, Louis-Étienne Dorval <ledor...@gmail.com> wrote:
> >>> Hi everyone!
> >>>
> >>> I'm very excited to start using NiFi and I think that it will be very
> >>> useful for some projects.
> >>>
> >>> I've been playing with it for some time and built a few basic flows,
> >>> but I'm having a hard time figuring out how to achieve a part of my
> >>> flow, or whether NiFi will be able to do it.
> >>> I'm building a flow around existing systems, so NiFi would run in
> >>> parallel with them and gather the output of these systems (everything
> >>> is asynchronous) to take actions.
> >>>
> >>> Everything starts with a GetJMSTopic on Topic1, followed by 2-3
> >>> processors that do attribute extraction.
> >>> During that time the existing system will process the same message,
> >>> enrich it (but also remove some useful information) and publish it on
> >>> Topic2.
> >>> I need the message from Topic2, so I've added another GetJMSTopic on
> >>> Topic2.
> >>> Then I need to somehow take the FlowFile from Topic1 and the one from
> >>> Topic2 and "merge" them together in order to have the attributes from
> >>> both FlowFiles.
> >>> After that I will probably need to use GetMongo to access some
> >>> information. This will probably create a new FlowFile that I need to
> >>> "merge" with the others.
> >>> Then I'll put that in HBase or something else, not sure yet.
> >>>
> >>> The part that I'm not sure how to solve is the "merge" of multiple
> >>> FlowFiles. I hesitate between using the MergeContent processor and
> >>> DetectDuplicate:
> >>>
> >>> MergeContent seems like what I need, but the existing systems might
> >>> add some latency (and it will increase when there are a lot of messages
> >>> published on Topic1), so I would need to increase the 'Maximum number
> >>> of Bins'.
> >>> It will probably affect the performance of the system, but how badly?
> >>> DetectDuplicate, it would feel awkward to use that