Is there currently a way to know how many times a FlowFile has been penalized? Do we have use cases where we want to penalize a FlowFile n times before sending it down an alternate relationship? I could imagine an API like penalizeOrTransfer(FlowFile flowFile, int numberOfTries, Relationship relationship). For example, someone might want to try processing a FlowFile three times before giving up on it.
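To make the idea concrete, here is a rough sketch of how such a helper might count attempts. It is not NiFi code: SimpleFlowFile and Outcome are stand-in types invented for this example, the "penalize.count" attribute name is an assumption, and the Relationship parameter is dropped so the sketch stays self-contained. A real version would presumably call penalize() or transfer() on the session where this one only returns a decision.

```java
import java.util.HashMap;
import java.util.Map;

public class PenalizeOrTransferSketch {

    /** Hypothetical attribute name used to remember how often a FlowFile was penalized. */
    static final String PENALTY_COUNT_ATTR = "penalize.count";

    /** Minimal stand-in for a FlowFile: just a bag of string attributes. */
    static final class SimpleFlowFile {
        final Map<String, String> attributes = new HashMap<>();
    }

    /** What the helper decided to do with the FlowFile. */
    enum Outcome { PENALIZED, TRANSFERRED }

    /**
     * Counts this call as one more failed try. While fewer than numberOfTries
     * tries have been made, the FlowFile is penalized and the count is stored;
     * on the final try the count is cleared and the FlowFile is handed off to
     * the alternate relationship instead.
     */
    static Outcome penalizeOrTransfer(SimpleFlowFile flowFile, int numberOfTries) {
        int previous = Integer.parseInt(
                flowFile.attributes.getOrDefault(PENALTY_COUNT_ATTR, "0"));
        int attempts = previous + 1;
        if (attempts >= numberOfTries) {
            // Out of tries: forget the counter and give up on this FlowFile.
            flowFile.attributes.remove(PENALTY_COUNT_ATTR);
            return Outcome.TRANSFERRED;
        }
        // Still has tries left: remember the attempt and penalize.
        flowFile.attributes.put(PENALTY_COUNT_ATTR, Integer.toString(attempts));
        return Outcome.PENALIZED;
    }

    public static void main(String[] args) {
        SimpleFlowFile flowFile = new SimpleFlowFile();
        for (int i = 1; i <= 3; i++) {
            System.out.println("attempt " + i + ": " + penalizeOrTransfer(flowFile, 3));
        }
    }
}
```

Keeping the count in an attribute means it travels with the FlowFile, so the processor itself would not need to hold any state between attempts.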
On Thu, Jan 28, 2016 at 12:47 PM, Michael de Courci <mdecou...@googlemail.com> wrote:

> Matt thanks for your reply
>
> I guess what I am saying in that case - if there is an error in a
> FlowFile, then the processor that detects this cannot proceed, so instead of
> calling an action to penalize the FlowFile it raises an exception,
> OutOFServiceException or ProcessorException.
> You could have an exception cause PeanilisedFlowFileException for this
> case.
>
> But within the processor other error causes may arise for an
> OutOFServiceException
>
> The point is that if the processor threw this exception then there can be
> a duration configuration - a time limit to keep this processor out of
> service and the connection to it and possibly any processors leading up to
> it - Naturally this will need to be indicated on the DFM - this will free
> resources and make the flow well behaved.
>
> Environmental failures will simply be a different category/cause of error
> that can be wrapped/captured also with a more general one
>
> With Kind Regards
> Michael de Courci
> mdecou...@gmail.com
>
> > On 28 Jan 2016, at 17:16, Matt Gilman <matt.c.gil...@gmail.com> wrote:
> >
> > Just to recap/level set...
> >
> > The distinction between yielding and penalization is important. Penalization
> > is an action taken on a FlowFile because the FlowFile cannot be processed
> > right now (like a naming conflict for instance). The Processor is
> > indicating that it cannot process that specific FlowFile at the moment but
> > may be able to process the next. Yielding is an indication that the
> > Processor is unable to work at all at the moment, likely due to an
> > environmental issue (like the out of service comment).
> >
> > If the concept of penalization were moved to a connection, does it
> > automatically penalize all FlowFiles transferred to it? We would lose some
> > granularity if a Processor wanted to penalize some FlowFiles routed to a
> > given Relationship but not others. I'm not sure if this is done in practice
> > or not, just wanted to mention it.
> >
> > Outside of this minor concern, I like the idea. I especially like that it
> > would help with the consistency of Processor behavior and transparency
> > about what the data flow is actually doing.
> >
> > Matt
> >
> > On Thu, Jan 28, 2016 at 12:00 PM, Michael de Courci <mdecou...@googlemail.com> wrote:
> >
> >> Hi
> >> I think it would be better/simpler to have one “out of service” concept
> >> to replace penalizing and yielding and when a plugin throws an exception
> >> then the plugin is deemed out of service, for a duration and so the
> >> connection to that plugin is disabled for the out of service duration.
> >>
> >> When a plugin is out of service and the connection disabled - then
> >> resources that it uses will be freed (yielded).
> >>
> >> The question then is what the behaviour of the plugin before the disabled
> >> connection - should be. My thought is to tend towards stability and make
> >> sure resources are freed, so there may need to be a “domino effect”/cascade
> >> effect where all plugins before are gradually put out of service.
> >>
> >> With Kind Regards
> >> Michael de Courci
> >> mdecou...@gmail.com
> >>
> >>> On 28 Jan 2016, at 16:34, Mark Payne <marka...@hotmail.com> wrote:
> >>>
> >>> All,
> >>>
> >>> I've been thinking about how we handle the concept of penalizing FlowFiles.
> >>> We've had a lot of questions lately about how penalization works & the
> >>> concept in general. Seems the following problems exist:
> >>>
> >>> - Confusion about difference between penalization & yielding
> >>> - DFM sees option to configure penalization period on all processors,
> >>>   even if they don't penalize FlowFiles.
> >>> - DFM cannot set penalty duration in 1 case and set a different value
> >>>   for a different case (different relationship, for example).
> >>> - Developers often forget to call penalize()
> >>> - Developer has to determine whether or not to penalize when building a
> >>>   processor. It is based on what the developer will think may make sense,
> >>>   but in reality DFMs sometimes want to penalize things when the
> >>>   processor doesn't behave that way.
> >>>
> >>> I'm wondering if it doesn't make sense to remove the concept of
> >>> penalization altogether from Processors and instead move the Penalty
> >>> Duration so that it's a setting on the Connection. I think this would
> >>> clear up the confusion and give the DFM more control over when/how long
> >>> to penalize. Could set the default to 30 seconds for self-looping
> >>> connections and no penalization for other connections.
> >>>
> >>> Any thoughts?
> >>>
> >>> Thanks
> >>> -Mark

--
Ricky Saltzer
http://www.cloudera.com