RE: Make NiFi Flow Read Only on Disconnect
There may be some circumstances where the approach in https://issues.apache.org/jira/browse/NIFI-6849 would allow a node to rejoin the cluster faster than waiting for the queues to empty and then overwriting the flow. However, I can imagine some edge cases where downstream connections have changed, leading to a different result, even though the upstream connections that had data queued haven't changed. Note that I haven't looked at exactly how https://issues.apache.org/jira/browse/NIFI-6849 compares the changes, so I may be completely off there.

I think waiting until your queues are zeroed out and then overwriting has a lot less room for error and is much simpler to implement. I don't have any jobs where a single flow lingers in NiFi for an extended period of time, but other people might.

Thanks
Shawn

-----Original Message-----
From: Mark Payne
Sent: Thursday, August 26, 2021 10:39 AM
To: users@nifi.apache.org
Subject: Re: Make NiFi Flow Read Only on Disconnect

Bryan,

Those changes only affect when a node starts up, not when it reconnects to a cluster after startup. However, there is a Jira [1] that I’m working on currently that should facilitate this approach.

Thanks
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-9069
Re: Make NiFi Flow Read Only on Disconnect
Bryan,

Those changes only affect when a node starts up, not when it reconnects to a cluster after startup. However, there is a Jira [1] that I’m working on currently that should facilitate this approach.

Thanks
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-9069

> On Aug 26, 2021, at 11:34 AM, Bryan Bende wrote:
>
> Mark made a lot of improvements to this back in 1.12.0 [1].
>
> I think you don't need to delete flow.xml.gz anymore. It will just
> take what the cluster has, as long as it doesn't require removing
> connections with data, and it backs up the current flow to a backup
> directory.
>
> [1] https://issues.apache.org/jira/browse/NIFI-6849
Re: Make NiFi Flow Read Only on Disconnect
Mark made a lot of improvements to this back in 1.12.0 [1].

I think you don't need to delete flow.xml.gz anymore. It will just take what the cluster has, as long as it doesn't require removing connections with data, and it backs up the current flow to a backup directory.

[1] https://issues.apache.org/jira/browse/NIFI-6849

On Thu, Aug 26, 2021 at 11:33 AM Joe Witt wrote:
>
> Shawn
>
> Ok cool. I think we'll go that route. A lot less code. A lot easier
> to reason over. We tried to be clever and it kept backfiring.
> Simpler wins.
>
> Thanks
Re: Make NiFi Flow Read Only on Disconnect
Shawn

Ok cool. I think we'll go that route. A lot less code. A lot easier to reason over. We tried to be clever and it kept backfiring. Simpler wins.

Thanks

On Thu, Aug 26, 2021 at 8:28 AM Shawn Weeks wrote:
>
> As long as we have a check to make sure no data is in flow on the node
> joining, that sounds wonderful and a lot simpler. In my most recent case I
> just stopped the inputs, waited till everything cleared, then shut down,
> deleted flow.xml.gz, and restarted. That's already worlds easier than how
> it used to be.
>
> Thanks
> Shawn
RE: Make NiFi Flow Read Only on Disconnect
As long as we have a check to make sure no data is in flow on the node joining, that sounds wonderful and a lot simpler. In my most recent case I just stopped the inputs, waited till everything cleared, then shut down, deleted flow.xml.gz, and restarted. That's already worlds easier than how it used to be.

Thanks
Shawn

-----Original Message-----
From: Joe Witt
Sent: Thursday, August 26, 2021 10:23 AM
To: users@nifi.apache.org
Subject: Re: Make NiFi Flow Read Only on Disconnect

Shawn

So that is one direction (being more restrictive). Another direction is to simply ditch the logic we have and allow changes and simply update the disconnected node when it rejoins. We have all kinds of super complex super duper awesome logic in there to help prevent users from getting into a bad state. What we have found is it was a lot of effort with minimal payoff. The simpler model is simply 'add the node to the cluster, make changes to ensure the flow matches, and move on'.

The only failure case would be when connecting and there is data in a connection which no longer exists. We can make exception handling for that mode.

How does that sound for you?

Thanks
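The manual procedure Shawn describes in this message (stop the inputs, wait for the queues to drain, then shut down, delete flow.xml.gz, and restart) hinges on knowing when nothing is queued any more. Below is a minimal sketch of that "wait until drained" step. It assumes the status document has the shape returned by `GET /nifi-api/flow/process-groups/root/status?recursive=true`, with `flowFilesQueued` and `bytesQueued` under `processGroupStatus.aggregateSnapshot`; verify those field names against your NiFi version. The helper names and the injected `fetch_status` callable are hypothetical, not part of NiFi.

```python
import time
from typing import Any, Callable, Mapping

Status = Mapping[str, Any]


def queues_are_empty(status: Status) -> bool:
    """Return True when the aggregate snapshot reports nothing queued.

    Assumption: `status` is the parsed JSON body of the process-group
    status endpoint, with counts under processGroupStatus.aggregateSnapshot.
    """
    snapshot = status["processGroupStatus"]["aggregateSnapshot"]
    return snapshot["flowFilesQueued"] == 0 and snapshot["bytesQueued"] == 0


def wait_until_drained(
    fetch_status: Callable[[], Status],
    max_polls: int = 60,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the queues are empty or max_polls is exhausted.

    `fetch_status` is a hypothetical callable that performs the HTTP
    request (with whatever authentication the node requires) and
    returns the parsed JSON.
    """
    for attempt in range(max_polls):
        if queues_are_empty(fetch_status()):
            return True
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # back off before the next poll
    return False
```

Only once `wait_until_drained` returns True would it be safe, under this procedure, to shut the node down and remove flow.xml.gz. Passing the fetcher in as a callable keeps the polling logic independent of any particular HTTP client.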
Re: Make NiFi Flow Read Only on Disconnect
Shawn

So that is one direction (being more restrictive). Another direction is to simply ditch the logic we have and allow changes and simply update the disconnected node when it rejoins. We have all kinds of super complex super duper awesome logic in there to help prevent users from getting into a bad state. What we have found is it was a lot of effort with minimal payoff. The simpler model is simply 'add the node to the cluster, make changes to ensure the flow matches, and move on'.

The only failure case would be when connecting and there is data in a connection which no longer exists. We can make exception handling for that mode.

How does that sound for you?

Thanks

On Thu, Aug 26, 2021 at 7:51 AM Shawn Weeks wrote:
>
> Hi, I know there have been a lot of improvements handling flow.xml.gz
> differences between nodes if a node gets disconnected or is down. I was
> wondering if there is a way to prevent NiFi from allowing any flow changes if
> all nodes are not up and available, both on the node that’s in a
> “DISCONNECTED” state and for the remaining cluster nodes. I’m trying to
> prevent the scenarios where a node ends up in a disconnected state but still
> allows changes, making reconnections more challenging.
>
> Thanks
>
> Shawn
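The simpler model Joe proposes in this message reduces the rejoin decision to one check: does the node hold data in a connection that the cluster's flow no longer contains? The sketch below illustrates that rule only. It is not NiFi's actual implementation, and the function names and the dictionary/set representation of connections are hypothetical.

```python
from typing import Dict, List, Set


def blocking_connections(
    local_queued: Dict[str, int],
    cluster_connection_ids: Set[str],
) -> List[str]:
    """Return IDs of local connections that would block adopting the cluster flow.

    local_queued maps a local connection ID to its queued FlowFile count.
    A connection blocks the rejoin only if it still holds data AND is
    missing from the cluster's flow; empty connections can simply be dropped.
    """
    return sorted(
        conn_id
        for conn_id, queued in local_queued.items()
        if queued > 0 and conn_id not in cluster_connection_ids
    )


def can_adopt_cluster_flow(
    local_queued: Dict[str, int],
    cluster_connection_ids: Set[str],
) -> bool:
    """True when the node can take the cluster's flow without stranding data."""
    return not blocking_connections(local_queued, cluster_connection_ids)
```

For example, a node with `{"conn-a": 0, "conn-b": 12, "conn-c": 5}` queued, joining a cluster whose flow contains only `conn-a` and `conn-b`, would be blocked by `conn-c`, which is the single failure case the message calls out.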