[
https://issues.apache.org/jira/browse/NIFI-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479251#comment-16479251
]
ASF GitHub Bot commented on NIFI-5204:
--------------------------------------
Github user mcgilman commented on the issue:
https://github.com/apache/nifi/pull/2713
Will review...
> When node joins cluster, if a processor is stopping but cluster says the
> state is disabled, node ends up in inconsistent state
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: NIFI-5204
> URL: https://issues.apache.org/jira/browse/NIFI-5204
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Critical
>
> In order to make this "easy" to replicate, I did the following:
> 1) Create a 2-node cluster.
> 2) On both nodes, update nifi.properties to set
> "nifi.variable.registry.properties" to "1.properties"
> 3) On both nodes, create 1.properties in $NIFI_HOME. For first node, set
> "sleep=2 mins" and for second node, set "sleep=0 millis"
> 4) Update DebugFlow to support expression language for the "@OnStopped Pause
> Time"
> 5) Configure flow with a DebugFlow processor. Can auto-terminate
> relationships and set run period to "10 secs." Set "@OnStopped Pause Time" to
> "${sleep}"
> 6) Disable DebugFlow processor.
> 7) Disconnect Node 1.
> 8) Go to Node 1 in browser and Start DebugFlow.
> 9) Stop DebugFlow.
> 10) While processor is still "stopping", go back to Node 2 in browser and
> request that Node 1 re-join the cluster.
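> For reference, steps 2 and 3 would leave files roughly like the following
> on each node. This is only a sketch: all other nifi.properties entries are
> omitted, and the file locations are assumed from the steps above.
> {code}
> # nifi.properties (both nodes)
> nifi.variable.registry.properties=1.properties
>
> # $NIFI_HOME/1.properties on Node 1
> sleep=2 mins
>
> # $NIFI_HOME/1.properties on Node 2
> sleep=0 millis
> {code}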
> Now, when Node 1 re-joins the cluster, it will attempt to disable the
> processor but won't be able to because the processor is still stopping. The
> following will be in the logs:
> {code:java}
> 2018-05-16 15:21:50,986 WARN [Reconnect to Cluster]
> org.apache.nifi.controller.ProcessorNode Processor cannot be disabled because
> its state is set to STOPPING{code}
> So we now have a node in an inconsistent state.
> Additionally, if we now go to Node 1 in our browser and unselect all
> components, and attempt to STOP the process group, the request that is
> replicated attempts to stop the DebugFlow processor. Node 2 will now fail to
> stop the processor because the processor is disabled. As a result, Node 2
> will now be kicked out of the cluster.
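
The sequence described in the issue can be sketched as a toy state machine. This is not NiFi's implementation: `ToyProcessor`, `State`, `finish_stopping`, and `reproduce` are hypothetical names invented for illustration, and the transition rules are assumptions taken only from the behaviour reported above (a processor that is STOPPING refuses to be disabled, and a disabled processor refuses to be stopped).

```python
from enum import Enum


class State(Enum):
    DISABLED = "DISABLED"
    STOPPED = "STOPPED"
    RUNNING = "RUNNING"
    STOPPING = "STOPPING"


class ToyProcessor:
    """Hypothetical stand-in for a processor on one node (not NiFi code)."""

    def __init__(self, state=State.STOPPED):
        self.state = state
        self.warnings = []

    def enable(self):
        if self.state is State.DISABLED:
            self.state = State.STOPPED

    def start(self):
        if self.state is not State.STOPPED:
            raise RuntimeError(f"cannot start while {self.state.value}")
        self.state = State.RUNNING

    def stop(self):
        # Assumption from the report: stopping a disabled processor fails.
        if self.state is State.DISABLED:
            raise RuntimeError("cannot stop a disabled processor")
        if self.state is State.RUNNING:
            # A long @OnStopped pause keeps the processor in STOPPING.
            self.state = State.STOPPING

    def finish_stopping(self):
        if self.state is State.STOPPING:
            self.state = State.STOPPED

    def disable(self):
        if self.state is State.DISABLED:
            return True
        if self.state is not State.STOPPED:
            # Mirrors the warning quoted in the issue; the state is left as-is.
            self.warnings.append(
                "Processor cannot be disabled because its state is set to "
                + self.state.value
            )
            return False
        self.state = State.DISABLED
        return True


def reproduce():
    # Step 6: the processor is disabled on both nodes.
    node1 = ToyProcessor(State.DISABLED)
    node2 = ToyProcessor(State.DISABLED)

    # Steps 7-9: on the disconnected Node 1, enable, start, then stop.
    node1.enable()
    node1.start()
    node1.stop()

    # Step 10: on re-join, Node 1 tries to match the cluster state (DISABLED).
    disabled_on_rejoin = node1.disable()

    # Later the @OnStopped pause ends; Node 1 is STOPPED, not DISABLED.
    node1.finish_stopping()

    # A replicated "stop" request succeeds on Node 1 but fails on Node 2.
    node1.stop()
    try:
        node2.stop()
        node2_failed = False
    except RuntimeError:
        node2_failed = True

    return node1, node2, disabled_on_rejoin, node2_failed
```

In this model the refused `disable()` call is the point where the two nodes diverge: Node 1 ends up STOPPED while the cluster (and Node 2) hold DISABLED, so a later replicated stop request succeeds on one node and fails on the other.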
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)