Re: [DISCUSS] Run Once scheduling

Irizarry Jr., Nazario Thu, 12 Jan 2017 11:06:02 -0800

The users that I work with are in the data-analytics space.  Without NiFi, what 
they tend to do is build scripts, often shell scripts, that they edit and 
modify from run to run.  Thus for this class of user it is not really about 
continuous flows.  But the connect-the-boxes metaphor is nonetheless a very good 
way to automate what they do and get them away from lots of script editing.


For that type of application and those types of users, when one builds a 
script, each step has either been executed or is going to be executed.  (For 
discussion's sake, let's ignore conditional execution.)  That is why the 
ability to create a flow in which items have or have not run is a natural way 
to migrate from scripts to flows.
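
To make the "has run / has not run" idea concrete, here is a minimal sketch in 
Python.  It is only a toy model of the Run Once behavior proposed earlier in 
this thread (stop after one successfully processed input); it is not NiFi code 
and does not use any NiFi API.  The class RunOnceStep, its fields, and the 
file names are hypothetical, and keeping a failed input on the queue is an 
assumption of mine.

```python
from collections import deque


class RunOnceStep:
    """Hypothetical model of one flow step; not part of NiFi's API."""

    def __init__(self, name, action):
        self.name = name
        self.action = action
        self.running = False   # RUNNING vs. STOPPED
        self.has_run = False   # the script-like "has this step executed?" flag

    def start(self):
        self.running = True

    def on_trigger(self, queue):
        """Process at most one queued input, then stop the step."""
        if not self.running or not queue:
            return None  # stopped, or a 'dry' spin with no data available
        item = queue.popleft()
        try:
            result = self.action(item)
        except Exception:
            # Assumption: on failure the input goes back and the step stays running.
            queue.appendleft(item)
            raise
        self.has_run = True
        self.running = False  # run once: stop after one successful input
        return result


inbox = deque(["january.csv", "february.csv"])
step = RunOnceStep("normalize", str.upper)
step.start()
first = step.on_trigger(inbox)   # processes "january.csv", then stops
second = step.on_trigger(inbox)  # step is stopped, so nothing is processed
```

After the first trigger the step reports has_run and is stopped, and the 
second input stays on the queue until someone calls start() again.  That is 
the trade-off Oleg raises below.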

Naz Irizarry
MITRE Corp.
617-893-0074



> On Jan 12, 2017, at 1:16 PM, Oleg Zhurakousky <ozhurakou...@hortonworks.com> 
> wrote:
> 
> I was just about to suggest the same. 
> Run-once would be a bit counterintuitive to flow processing as a 
> concept. Basically, think of it this way: a flow, or any part of it, has only 
> two states - RUNNING or STOPPED. In the RUNNING state it processes the data as 
> it arrives (every second, every minute, every day, etc.). Indeed, there may be 
> a concern that the processor will do a lot of 'dry' spins if no data is 
> available, but fortunately NiFi allows you to limit the impact of that by 
> configuring the 'yield duration'. By default it is set to 1 sec, but for your 
> case you may want to set it to 1 hour or so, essentially controlling the 
> scheduling of such a processor between 'dry' spins.
> 
> That said, and just to entertain the idea of Run Once, what do you think 
> should be the processor state after it has run once? Let's assume it did and 
> somehow was stopped... then what? Data arrives on the incoming queue, 
> but nothing is processed until someone manually goes and restarts the 
> processor. Right?
> I mean, from the general workflow standpoint the concern is very valid, but 
> from the flow-processing standpoint the fact that NiFi does not support it is 
> actually more of a feature than a lack of functionality.
> 
> Thoughts?
> 
> Cheers
> Oleg
> 
>> On Jan 12, 2017, at 1:02 PM, Joe Witt <joe.w...@gmail.com> wrote:
>> 
>> Naz,
>> 
>> Why not just leave all the processes running?  If the data only
>> arrives periodically that is ok, right?
>> 
>> Thanks
>> Joe
>> 
>> On Thu, Jan 12, 2017 at 10:54 AM, Irizarry Jr., Nazario <n...@mitre.org> 
>> wrote:
>>> On a project that I am on we have been looking at using NiFi for 
>>> orchestrations that are invoked infrequently.  For example, once a month a 
>>> new data input product becomes available and then one wants to run it 
>>> through a set of processing steps that can be nicely implemented using NiFi 
>>> processors.  However, using the interval or cron scheduling for this 
>>> purpose begins to get cumbersome after a while with the need to start and 
>>> manually stop these occasional flows.
>>> 
>>> It would be fairly easy to add an additional scheduling option - “Run Once” 
>>> for this use case.  The behavior would be that when a processor is set to 
>>> run once it automatically stops after it has successfully processed one 
>>> input.
>>> 
>>> What do people think?  We are willing to implement this small enhancement.
>>> 
>>> Cheers,
>>> 
>>> Naz Irizarry
>>> MITRE Corp.
>>> 617-893-0074
>>> 
>>> 
>>> 
>> 
>
