Lucas

The change proposed is specific to the original intent, implementation,
evolution, and bridging to new intent for certain cases of that specific
processor.

I remain supportive of processors using the capabiltities of the api and
framework to best meet the user needs.  “failure” is just a string. The
meaning of that depends on a processor and what it does.

Thanks

On Fri, Feb 9, 2024 at 5:04 PM Lucas Ottersbach <lucas.ottersb...@gmail.com>
wrote:

> I think that's a good approach which actually addresses the underlying
> issue. Thank you Joe, Mark and all others.
>
> As far as I know the default last resort behaviour of rollback + yield,
> that a lot processors exhibit, is due to them being based on
> AbstractProcessor.
>
> Does it make sense to incorporate the outlined approach into
> AbstractProcessor instead of UpdateAttribute?
> This way, other processors can undergo the same (opt-in) behavioural change
> without having to re-implement it on a per processor basis.
> Every processor in question only would need to add the property declared by
> AbstractProcessor to its list of properties. Processors that do not include
> the property wouldn't support to configure the behaviour and thus default
> to the current behaviour instead.
>
> I think that would provide a good middle ground, both for users that want a
> explicit failure relationship and for those that rather want a simpler flow
> and ensure those kinds of errors won't happen another way.
>
> What do you think?
>
> Lucas
>
> Joe Witt <joe.w...@gmail.com> schrieb am Sa., 10. Feb. 2024, 00:06:
>
> > Lots of good commentary and great focus on minimizing impact to the users
> > while fixing what is admittedly not the most desired behavior for some
> > cases as it relates to the very popular UpdateAttribute.
> >
> > We do not want to enforce that all processors have failure relationships.
> > Presumably that notion is specific to processors which take an input
> > flowfile.  Even then though we have many examples where having a failure
> > relationship does not add value.  A few examples such as DistributeLoad,
> > DuplicateFlowFile and others which have concepts such as 'unmatched'
> etc..
> > Certainly most things can and should but we dont need a strict policy
> > here.  We should still let people building processors be thoughtful about
> > what is best.
> >
> > UpdateAttribute specifically... There is history to why it works the way
> it
> > does.  Things changed and it didn't evolve or couldn't because its use
> was
> > so widespread and we didnt want to create too much pain for users.  But
> > because of 2.0 and some improvements like the ability to code up
> migration
> > behaviors on a per extension basis we can work our way out of this
> without
> > causing pain for the users.
> >
> > For NiFi 1.x it should stay as is.
> > For NiFi 2.x a solution is outlined in NIFI-6344.  It reads:
> > Given the new capabilities for migrating configs in NiFi 2.0 we can fix
> > this.
> >
> > Add a property to UpdateAttribute that is 'Failure Strategy' and the
> > options are 'rollback' or 'route to failure'. If that property is set
> with
> > rollback it behaves like it does now and I recommend that remain the
> > default. If that property is set to 'route to failure' then we add a
> > relationship which needs to be set which is of course called 'failure'.
> For
> > flows being migrated from a version before this behavior was available
> to a
> > version that has this capability we just set the value of this parameter
> to
> > our default.
> >
> > This lets existing flows migrate over just fine. It lets us give users a
> > failure path for the cases they want one. It lets us keep the vast
> majority
> > of flows and uses of this where failure is not relevant stay clean. And
> it
> > handles migration.
> >
> > The processor needs to be updated to catch the exceptions and then follow
> > this logic. Today it just lets it fly to the framework which causes the
> > processor to yield and penalizes the flowfile for the default time. When
> > now catching the problem we should just avoid yielding and instead
> penalize
> > the specific offending flowfile which lets everything else operate super
> > fast.
> >
> > Thanks to Mark Payne for the chat on this.
> >
> > This can be done at any time by anyone that wants to take it on.  It is
> not
> > a blocker for nifi 2.x.  The migration capabilities give us really nice
> > options for many cases we've hit over the years going forward.
> >
> > Thanks
> >
> >
> > On Fri, Feb 9, 2024 at 2:53 PM Adam Taft <a...@adamtaft.com> wrote:
> >
> > > I mean, this really speaks to the principal that (in my humble opinion)
> > All
> > > Processors Shalt Have Failure Relationships as best practice.
> > >
> > > So I think the decision is really, how much pain do we want to impose
> on
> > > NiFi 2.0 adopters. UpdateAttribute and RouteOnAttribute are used
> > literally
> > > everywhere (especially on large installations), and it would be quite
> the
> > > chore(!) to update these if we were to make a non-compatible change for
> > > 2.0.  But 2.0 is also a very logical time/place to make such backwards
> > > breaking changes to UpdateAttribute, RouteOnAttribute and/or other
> > > processors that are missing failure relationships.
> > >
> > > I just fundamentally have never liked the lack of control for flowfiles
> > > getting requeued if a processor exception is left uncaught. You are
> > almost
> > > always just going to repeat the failure condition, it's rarely useful
> to
> > > retry without some sort of deliberate flow manager action. The
> > enhancements
> > > to relationships (to enable retries, backoffs, etc.) even reinforces my
> > > point of view that requeuing is almost always wrong.
> > >
> > > So I think the decision here is how much pain is this going to cause
> for
> > > NiFi 2.0 adoption? And if it's too much pain, then we should do
> something
> > > in a backwards compatible kind of way (like offering a dynamic failure
> > > relationship for those target processors). And if it's not too much
> pain,
> > > just add failure relationships directly to the processors and rip the
> > > bandaid off.
> > >
> > > A solution like a "try/catch" option to the expression language is not
> > that
> > > exciting to me. I do believe that adoption will be hindered if we make
> a
> > > breaking change because of the high usage of Update and Route
> processors.
> > > And so 'm concluding that a dynamic relationship approach, leaving the
> EL
> > > as designed, is the best path forward. But I'm hoping there's even a
> more
> > > clever solution out there waiting for someone to bring forward.
> > >
> > > I do believe whatever we do should be a 2.0 thing. Let's not consider
> any
> > > changes on this for 1.x (unless it somehow enables better adoption).
> > >
> > > /Adam
> > >
> > > On Fri, Feb 9, 2024 at 1:27 PM u...@moosheimer.com <u...@moosheimer.com>
> > > wrote:
> > >
> > > > Yes, that certainly involves a lot of effort.
> > > > I wonder whether it's a good idea to fix a possible "design flaw"
> with
> > a
> > > > construct that is neither consistent nor easy to handle.
> > > >
> > > > I also think the argument that you have to adjust a lot in the flow
> is
> > > > questionable.
> > > > With Release 2.0, so much has to be adapted (cron jobs, database
> schema
> > > > controller etc.) that it hardly matters whether you adapt the
> > > > UpdateAttribute processors or not.
> > > >
> > > > The effort to adapt and test everything is already present with the
> > > > upgrade to 2.0.
> > > > Doesn't Release 2.0 actually offer a really good opportunity to adapt
> > > > everything that has been put off for a long time?
> > > >
> > > > Who knows when the next opportunity will come? With Release 3.0? Or
> not
> > > at
> > > > all, because NiFi is becoming more and more widespread and the
> customer
> > > > lobby is getting bigger and bigger? Cloudera will find it
> increasingly
> > > > difficult to pack "large-scale" customizations into a release.
> > > >
> > > > Whether we wait 3-4 months or 5-6 months for the 2.0 release doesn't
> > > > really matter, does it?
> > > > But what do I know ... the NiFi experts and full-time maintainers
> will
> > > > certainly do the right thing.
> > > >
> > > > -- Uwe
> > > >
> > > > > Am 09.02.2024 um 22:04 schrieb Michael Moser <moser...@gmail.com>:
> > > > >
> > > > > These are great ideas!  I do love Adam's dynamic relationship idea
> > > over
> > > > > creating a failure relationship that is auto-terminated. This would
> > > make
> > > > > flow migrations in Registry and NiFi easier.
> > > > >
> > > > > After some more pondering, I (slowly) realized that this problem
> > > affects
> > > > > more than UpdateAttribute, though.  You can easily get expression
> > > > language
> > > > > (explang) exceptions anywhere that explang is used.  I'm sure all
> of
> > us
> > > > can
> > > > > create a RouteOnAttribute configuration that causes an explang
> > > exception
> > > > > which rolls back the flowfile.  If we spend so much effort on a
> > > solution
> > > > it
> > > > > would be a shame for that to only apply to UpdateAttribute.
> > > > >
> > > > > For this reason I would favor Matt's idea of a try() or trycatch()
> > > > explang
> > > > > method.  How it might work isn't intuitive, though.  Would we have
> to
> > > > make
> > > > > it aware of a data type to return?  RouteOnAttribute expects
> boolean
> > > but
> > > > > UpdateAttribute expects string (or convertible to string).
> > > > >
> > > > > I can see why the rollback/admin yield solution has been status quo
> > for
> > > > so
> > > > > long.
> > > > >
> > > > > -- Mike
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >> On Fri, Feb 9, 2024 at 10:03 AM u...@moosheimer.com <
> > > u...@moosheimer.com>
> > > > >> wrote:
> > > > >>
> > > > >> 👍
> > > > >> --Uwe
> > > > >>
> > > > >>>> Am 09.02.2024 um 13:50 schrieb Mike Thomsen <
> > mikerthom...@gmail.com
> > > >:
> > > > >>>
> > > > >>> How about a third option which is to provide three options:
> > > > >>>
> > > > >>> 1) Default - status quo, exceptions cause it to yield
> > > > >>> 2) Exception = moves forward to success w/ an error attribute, an
> > > error
> > > > >> log
> > > > >>> statement that triggers a bulletin, etc to let data manages know
> > > what's
> > > > >>> happening.
> > > > >>> 3) Exception = moves to a failure relationship that is otherwise
> > > > >>> autoterminated
> > > > >>>
> > > > >>>> On Thu, Feb 8, 2024 at 7:12 PM Matt Burgess <
> mattyb...@apache.org
> > >
> > > > >> wrote:
> > > > >>>>
> > > > >>>> Mike's option #2 seems solid but would take a lot of work and
> > there
> > > > will
> > > > >>>> always be inputs we don't account for. I support that work but
> in
> > > code
> > > > >>>> sometimes we just do a "catch(Throwable)" just so it doesn't
> blow
> > > up.
> > > > >> What
> > > > >>>> about a subjectless "try" or "trycatch" function you can wrap
> > around
> > > > >> your
> > > > >>>> whole expression? If no exception is thrown, the evaluated value
> > > will
> > > > be
> > > > >>>> returned but if one is thrown, you can provide some alternate
> > value
> > > > that
> > > > >>>> you can check downstream. As this is optional it would retain
> the
> > > > >> current
> > > > >>>> behavior unless you use it, and then it takes the place of all
> > those
> > > > >>>> ifElse(isXYZValid()) calls we'd need throughout the expression.
> > > > >>>>
> > > > >>>> Regards,
> > > > >>>> Matt
> > > > >>>>
> > > > >>>>
> > > > >>>> On Wed, Feb 7, 2024 at 8:11 PM Phillip Lord <
> > phillord0...@gmail.com
> > > >
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> IMO... UpdateAttribute has been around since the beginning of
> > > time, I
> > > > >>>> can't
> > > > >>>>> see adding a failure relationship. At the same time I
> understand
> > > the
> > > > >> want
> > > > >>>>> for such exceptions to be handled more gracefully rather than
> > > rolling
> > > > >>>> back
> > > > >>>>> indefinitely.
> > > > >>>>> I'd vote in favor of considering Moser's option #2... and being
> > > able
> > > > to
> > > > >>>>> implement an "if this then that" logic within your flow.
> > > > >>>>>
> > > > >>>>> Also just thinking... for every UA failure you have to
> consider a
> > > > good
> > > > >>>>> failure-management strategy, which MIGHT add a lot of noise to
> > the
> > > > >> flow.
> > > > >>>>> Something that might otherwise easily be identified in a
> > downstream
> > > > >>>>> component and/or database/etc.
> > > > >>>>>
> > > > >>>>> My 2 cents **
> > > > >>>>> Phil
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>> On Wed, Feb 7, 2024 at 5:18 PM Adam Taft <a...@adamtaft.com>
> > > wrote:
> > > > >>>>>
> > > > >>>>>> Or better, the failure relationship just doesn't even exist
> > until
> > > > the
> > > > >>>>>> property "Has Failure Relationship" is set to True.  This
> > involves
> > > > >>>>> updating
> > > > >>>>>> UpdateAttribute to have dynamic relationships (the failure
> > > > >>>> relationships
> > > > >>>>>> appearing on true), which isn't hard to do in processor code.
> > > > >>>>>>
> > > > >>>>>> This has the advantage of being backwards compatible for
> > existing
> > > > >> users
> > > > >>>>> and
> > > > >>>>>> allows the failure relationship to exist for new
> configurations.
> > > > >>>>> Obviously
> > > > >>>>>> the processor would need an update to catch Expression
> Language
> > > > >>>>> exceptions
> > > > >>>>>> and then route conditionally to failure.
> > > > >>>>>>
> > > > >>>>>> Just thinking out loud.
> > > > >>>>>> /Adam
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Wed, Feb 7, 2024 at 1:48 PM u...@moosheimer.com <
> > > > u...@moosheimer.com
> > > > >>>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hi Mike,
> > > > >>>>>>>
> > > > >>>>>>> How about the option of introducing a new property that
> decides
> > > > >>>> whether
> > > > >>>>>> to
> > > > >>>>>>> route to the 'failure' relationship in the event of an error?
> > > > >>>>>>> If this property is set to false, then the 'failure'
> > relationship
> > > > is
> > > > >>>>>>> automatically set to 'terminate' (since nothing is routed
> there
> > > > >>>>> anyway).
> > > > >>>>>>>
> > > > >>>>>>> Then everyone can decide whether and where they want to use
> > this
> > > > new
> > > > >>>>>>> feature or not.
> > > > >>>>>>> All other options would still be possible with such a
> solution.
> > > > >>>>>>>
> > > > >>>>>>> -- Uwe
> > > > >>>>>>>
> > > > >>>>>>>> Am 07.02.2024 um 22:15 schrieb Michael Moser <
> > > moser...@gmail.com
> > > > >:
> > > > >>>>>>>>
> > > > >>>>>>>> Hi Dan,
> > > > >>>>>>>>
> > > > >>>>>>>> This has been discussed in the past, as you found with those
> > two
> > > > >>>> Jira
> > > > >>>>>>>> tickets.  Personally, I'm still not sure whether a new
> failure
> > > > >>>>>>> relationship
> > > > >>>>>>>> on UpdateAttribute in 2.0 is a good approach.  I have heard
> > from
> > > > >>>> some
> > > > >>>>>>>> dataflow managers that would not want to go through their
> > entire
> > > > >>>>> graph
> > > > >>>>>>> when
> > > > >>>>>>>> upgrading to 2.0 and update every UpdateAttribute
> > configuration.
> > > > >>>>>>>>
> > > > >>>>>>>> I have heard some alternatives to a 'failure' relationship
> > that
> > > I
> > > > >>>>> would
> > > > >>>>>>>> like to share as options.
> > > > >>>>>>>>
> > > > >>>>>>>> 1) Add a new property to UpdateAttribute that controls
> > whether a
> > > > >>>>>> flowfile
> > > > >>>>>>>> that causes an expression language exception either yields
> and
> > > > >>>> rolls
> > > > >>>>>>> back,
> > > > >>>>>>>> or silently fails to update the attribute and sends the
> > flowfile
> > > > to
> > > > >>>>>>>> success.  I personally don't like this, because the use case
> > for
> > > > >>>>>> "silent
> > > > >>>>>>>> failure" seems really like a rarely needed edge case.
> > > > >>>>>>>>
> > > > >>>>>>>> 2) Identify all expression language methods that can throw
> an
> > > > >>>>> exception
> > > > >>>>>>> and
> > > > >>>>>>>> document that fact in the Expression Language Guide (some
> > > methods
> > > > >>>>>> already
> > > > >>>>>>>> mention they can throw an "exception bulletin").  Then
> > implement
> > > > >>>> new
> > > > >>>>>>>> expression methods to check if an expression could fail, and
> > use
> > > > >>>> that
> > > > >>>>>> in
> > > > >>>>>>>> UpdateAttribute advanced rules.  For example, if the
> format()
> > > and
> > > > >>>>>>>> formatInstant() methods can fail on a negative number, we
> > > create a
> > > > >>>>> new
> > > > >>>>>>>> method such as isValidMilliseconds().  This already exists
> for
> > > > some
> > > > >>>>>>> cases,
> > > > >>>>>>>> such as isJson() which can do a quick check of some value
> > before
> > > > >>>>>> calling
> > > > >>>>>>>> jsonPathDelete() on it.
> > > > >>>>>>>>
> > > > >>>>>>>> I'm curious to hear more thoughts on this.
> > > > >>>>>>>>
> > > > >>>>>>>> -- Mike
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>> On Wed, Jan 31, 2024 at 11:02 AM Dan S <dsti...@gmail.com>
> > > > wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>> My team is requesting a failure relationship for
> > > UpdateAttribute
> > > > >>>> as
> > > > >>>>>>> seen in
> > > > >>>>>>>>> NIFI-5448 <https://issues.apache.org/jira/browse/NIFI-5448
> >
> > > and
> > > > >>>>>>> NIFI-6344
> > > > >>>>>>>>> <https://issues.apache.org/jira/browse/NIFI-6344> as we
> are
> > > > >>>>>>>>> experiencing the same problem where a NIFI Expression
> > Language
> > > is
> > > > >>>>>>> throwing
> > > > >>>>>>>>> an exception. In the PR for NIFI-5448 it was mentioned this
> > > > >>>> feature
> > > > >>>>>>> would
> > > > >>>>>>>>> have to wait until NIFI 2.0.0. I wanted to know if there is
> > any
> > > > >>>>> active
> > > > >>>>>>> work
> > > > >>>>>>>>> regarding this and whether eventually there will be a
> failure
> > > > >>>>>>> relationship
> > > > >>>>>>>>> added to UpdateAttribute?
> > > > >>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>
> > > >
> > > >
> > >
> >
>

Reply via email to