Re: Delay Processor

2016-11-15 Thread Andrew Grande
Joe,

It's good to know some thinking went into this feature before. Basically,
I'm trying to put a spotlight on these 2 areas:

   1. Making it, potentially, more generic than a retry loop, e.g. by
   enhancing the ControlRate processor.
   2. Making these policies more explicit, so a user wouldn't have to
   second-guess where to go to configure the behavior. Here I don't have a
   strong opinion on whether a dedicated (or enhanced) processor or changes
   to the standard processor configuration screens would be the way.

Andrew


Re: Delay Processor

2016-11-15 Thread Andrew Grande
Oleg,

I'll break my response into 2 threads. I understand your use case of 'delay
until X', but frankly would design it differently if we're talking about
long-term transactions or schedules. It may involve systems external to
NiFi. Anyway, I'd like to keep this use case out of scope for the delay
processor, at least for the immediate discussion.

Now jumping to the other thread.


Re: Delay Processor

2016-11-15 Thread Joe Witt
The concept of flowfile penalization exists to support many of the
requirements for delay.  A component (processor) can penalize a
flowfile for some given period of time, and the penalty is honored on
the connection the flowfile is placed on, such that flowfiles without
a penalty get higher priority.  It was built for the typical failure
loops.  This is appropriate when the problem lies between that
particular flowfile and the thing the processor is trying to interact
with, so that simply marking the flowfile as problematic for a bit is
likely to let the condition go away without making the problem
worse.
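
A minimal sketch of the penalization idea described above, assuming a
connection is just an ordered queue. This is a simulation, not NiFi's
actual implementation; `FlowFile`, `penalty_expires_at`, `penalize` and
`poll` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlowFile:
    name: str
    # Hypothetical field: time (in seconds) until which the flowfile is penalized.
    penalty_expires_at: float = 0.0


def penalize(flowfile, now, penalty_seconds):
    """Mark a flowfile as penalized for penalty_seconds starting at `now`."""
    flowfile.penalty_expires_at = now + penalty_seconds


def poll(queue, now):
    """Remove and return the first flowfile whose penalty has expired.

    Returns None if every queued flowfile is still penalized, so
    non-penalized flowfiles are always served ahead of penalized ones.
    """
    for index, flowfile in enumerate(queue):
        if flowfile.penalty_expires_at <= now:
            return queue.pop(index)
    return None
```

With two flowfiles queued and the first one penalized for 30 seconds, a
poll at t=1 hands out the second flowfile; the first becomes eligible
again only once its penalty has expired.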

There is also the concept of process yielding.  A component
(processor) can signal that it should yield, which means it will not be
triggered again at its normally scheduled time but will instead wait
until its yield period is up.  This makes sense when the problem lies
in the state of that processor and its configuration relative to
whatever it is trying to do or interact with, and is expected to go
away or be resolved on its own.
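
To contrast this with penalization, here is a rough sketch of yielding,
where the delay is attached to the processor rather than to any single
flowfile. It is a simulation under stated assumptions, not NiFi code;
`Processor`, `yield_expires_at` and `trigger_if_ready` are hypothetical
names.

```python
class Processor:
    def __init__(self, name, yield_seconds):
        self.name = name
        self.yield_seconds = yield_seconds
        # Hypothetical field: time before which the scheduler skips this processor.
        self.yield_expires_at = 0.0
        self.runs = 0

    def yield_for_period(self, now):
        """Ask the scheduler to leave this processor alone for its yield period."""
        self.yield_expires_at = now + self.yield_seconds


def trigger_if_ready(processor, now, work):
    """Run `work` unless the processor is still yielding.

    `work` is a callable returning True on success. On failure the
    processor yields, so it is not triggered again until the period is up.
    Returns True if the processor was triggered, False if it was skipped.
    """
    if now < processor.yield_expires_at:
        return False
    processor.runs += 1
    if not work():
        processor.yield_for_period(now)
    return True
```

Note that the whole processor pauses here, regardless of which flowfile
it would have picked up next.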

Mark Payne and I have talked in the past about the notion of the
framework automatically tracking flowfiles that appear to be looping
and doing exponential backoff on them.  This would be done just as you
suggested, via flowfile attributes.

Is your case/concern more like the 'something is wrong with this flow
file vs environment' or more like 'something is wrong with this
process vs environment'?



On Tue, Nov 15, 2016 at 8:50 AM, Andrew Grande wrote:
> Hi,
>
> I'd like to check where discussions are on this, or propose the new
> component otherwise.
>
> Use case: make delay strategies explicit, easier to use. E.g. think of a
> failure retry loop.
>
> Currently, ControlRate is somewhat related, but can be improved. E.g.
> introduce delay strategies a la prioritizers on the connection?
>
> Thinking out loud, something like an exponential backoff strategy could be
> kept stateless by adding a number of housekeeping attributes to the FF,
> which eliminates the need for any state in the processor itself.
>
> I'll stop here to see if any ideas were captured prior and what the community
> thinks of it.
>
> Andrew


Re: Delay Processor

2016-11-15 Thread Oleg Zhurakousky
I am +1 on this as I’ve seen many cases in the field where it is applicable, and
as you mention, exponential backoff is one of the common ones. That said, I am
wondering if that has to be a processor at all? Actually, let me answer my own
question. There are definitely cases where it has to be a processor. Those are
requirements for a true, intentional delay (e.g., Compute Andrew’s greetings ->
Delay until his b-day -> Send Greetings).
Exponential backoff is a bit different since it almost aligns with circuit
breakers and retries. Currently we simply retry by resubmitting the flow file
with a fixed yield. However, one may argue that if something failed the first
time, it is very likely to fail again and again. It may also
succeed, but arguably it has a higher chance of succeeding after a
delay that increases on each subsequent failure, until we choose to
consider it a lost cause and stop resubmitting it.
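
The increasing delay with an eventual give-up could look roughly like
the sketch below. The function name `next_delay` and all of its
parameters and defaults are assumptions for illustration, not an
existing framework setting.

```python
def next_delay(failures, base_seconds=1.0, factor=2.0,
               max_delay_seconds=60.0, max_retries=5):
    """Return the delay before the next retry, or None to give up.

    failures: number of failures seen so far for this flow file (>= 1).
    The delay grows by `factor` on each failure and is capped at
    max_delay_seconds; once failures exceeds max_retries the flow file
    is considered failed for good and is not resubmitted.
    """
    if failures > max_retries:
        return None
    return min(base_seconds * factor ** (failures - 1), max_delay_seconds)
```

With the defaults, the first five failures wait 1, 2, 4, 8 and 16
seconds, and the sixth failure returns None.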

So in summary I see a two-part requirement: a processor and an enhancement to
the core framework’s retry logic.

Thoughts?
Cheers
Oleg


> On Nov 15, 2016, at 8:50 AM, Andrew Grande wrote:
> 
> Hi,
> 
> I'd like to check where discussions are on this, or propose the new component 
> otherwise.
> 
> Use case: make delay strategies explicit, easier to use. E.g. think of a 
> failure retry loop.
> 
> Currently, ControlRate is somewhat related, but can be improved. E.g. 
> introduce delay strategies a la prioritizers on the connection?
> 
> Thinking out loud, something like an exponential backoff strategy could be 
> kept stateless by adding a number of housekeeping attributes to the FF, which 
> eliminates the need for any state in the processor itself.
> 
> I'll stop here to see if any ideas were captured prior and what the community 
> thinks of it.
> 
> Andrew



Delay Processor

2016-11-15 Thread Andrew Grande
Hi,

I'd like to check where discussions are on this, or propose the new
component otherwise.

Use case: make delay strategies explicit, easier to use. E.g. think of a
failure retry loop.

Currently, ControlRate is somewhat related, but can be improved. E.g.
introduce delay strategies a la prioritizers on the connection?

Thinking out loud, something like an exponential backoff strategy could be
kept stateless by adding a number of housekeeping attributes to the FF,
which eliminates the need for any state in the processor itself.
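
As a rough illustration of the stateless idea, the sketch below keeps
all backoff bookkeeping in the flowfile's attributes (a string-to-string
map), so the component itself remembers nothing between invocations.
The attribute names `backoff.attempt` and `backoff.not.before`, and the
functions `on_failure` and `is_ready`, are hypothetical.

```python
def on_failure(attributes, now, base_seconds=1.0):
    """Return a copy of the attributes with updated backoff housekeeping.

    The attempt counter is read from and written back to the attributes,
    and the delay doubles with each attempt.
    """
    attempt = int(attributes.get("backoff.attempt", "0")) + 1
    delay = base_seconds * 2 ** (attempt - 1)
    updated = dict(attributes)
    updated["backoff.attempt"] = str(attempt)
    updated["backoff.not.before"] = str(now + delay)
    return updated


def is_ready(attributes, now):
    """True once the flowfile's recorded 'not before' time has passed."""
    return now >= float(attributes.get("backoff.not.before", "0"))
```

Because the counter and the next eligible time travel with the FF, any
instance of the component can make the retry decision from the
attributes alone.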

I'll stop here to see if any ideas were captured prior and what the community
thinks of it.

Andrew