[ https://issues.apache.org/jira/browse/NIFI-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Gilman updated NIFI-5204:
------------------------------
       Resolution: Fixed
    Fix Version/s: 1.7.0
           Status: Resolved  (was: Patch Available)

> When node joins cluster, if a processor is stopping but cluster says the state is disabled, node ends up in inconsistent state
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-5204
>                 URL: https://issues.apache.org/jira/browse/NIFI-5204
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Critical
>             Fix For: 1.7.0
>
>
> In order to make this "easy" to replicate, I did the following:
> 1) Create a 2-node cluster.
> 2) On both nodes, update nifi.properties to set "nifi.variable.registry.properties" to "1.properties".
> 3) On both nodes, create 1.properties in $NIFI_HOME. For the first node, set "sleep=2 mins" and for the second node, set "sleep=0 millis".
> 4) Update DebugFlow to support expression language for the "@OnStopped Pause Time" property.
> 5) Configure the flow with a DebugFlow processor. Can auto-terminate relationships and set run period to "10 secs." Set "@OnStopped Pause Time" to "${sleep}".
> 6) Disable the DebugFlow processor.
> 7) Disconnect Node 1.
> 8) Go to Node 1 in the browser and Start DebugFlow.
> 9) Stop DebugFlow.
> 10) While the processor is still "stopping", go back to Node 2 in the browser and request that Node 1 re-join the cluster.
>
> Now, when Node 1 re-joins the cluster, it will attempt to disable the processor but won't be able to because the processor is still stopping. The following will be in the logs:
> {code:java}
> 2018-05-16 15:21:50,986 WARN [Reconnect to Cluster] org.apache.nifi.controller.ProcessorNode Processor cannot be disabled because its state is set to STOPPING{code}
> So we now have a node in an inconsistent state.
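
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not NiFi's actual code: the class `SketchProcessorNode`, its methods, and the `ScheduledState` enum defined here are hypothetical stand-ins. The sketch also assumes that a disable request is honored only when the processor is fully stopped, which is what the WARN message in the log suggests. The sketch enables the processor before starting it on the disconnected node, which is a further assumption about how step 8 would play out.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ProcessorStateSketch {

    /** Hypothetical state set for illustration; defined locally, not taken from NiFi. */
    enum ScheduledState { DISABLED, STOPPED, RUNNING, STOPPING }

    /** Hypothetical stand-in for a processor node. Starts out DISABLED, as in step 6. */
    static final class SketchProcessorNode {
        private final AtomicReference<ScheduledState> state =
                new AtomicReference<>(ScheduledState.DISABLED);

        ScheduledState getState() {
            return state.get();
        }

        void enable() {
            state.compareAndSet(ScheduledState.DISABLED, ScheduledState.STOPPED);
        }

        void start() {
            state.compareAndSet(ScheduledState.STOPPED, ScheduledState.RUNNING);
        }

        /** Stop was requested; the processor stays STOPPING until its stop hook returns. */
        void beginStop() {
            state.compareAndSet(ScheduledState.RUNNING, ScheduledState.STOPPING);
        }

        /** The stop hook (for example, a long "@OnStopped Pause Time") has finished. */
        void completeStop() {
            state.compareAndSet(ScheduledState.STOPPING, ScheduledState.STOPPED);
        }

        /** Assumed rule: disabling succeeds only from STOPPED. Returns false otherwise. */
        boolean disable() {
            boolean disabled =
                    state.compareAndSet(ScheduledState.STOPPED, ScheduledState.DISABLED);
            if (!disabled) {
                System.out.println("WARN Processor cannot be disabled because its state is set to "
                        + state.get());
            }
            return disabled;
        }
    }

    public static void main(String[] args) {
        SketchProcessorNode node1 = new SketchProcessorNode();

        // Steps 8-9 on the disconnected node: start, then stop while the stop hook sleeps.
        node1.enable();
        node1.start();
        node1.beginStop();

        // Step 10: the cluster's flow says DISABLED, so the reconnect tries to disable.
        boolean applied = node1.disable();
        System.out.println("disable applied: " + applied + ", local state: " + node1.getState());

        // Once the stop hook returns, the same request would succeed in this model.
        node1.completeStop();
        System.out.println("retry applied: " + node1.disable() + ", local state: " + node1.getState());
    }
}
```

In this model the rejected `disable()` call leaves the node at STOPPING (and later STOPPED) while the rest of the cluster believes the processor is DISABLED, which is the inconsistency the issue reports. The retry at the end only shows that the request is valid once the stop completes; the issue text does not say how the actual fix in 1.7.0 handles it.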
> Additionally, if we now go to Node 1 in our browser, deselect all components, and attempt to STOP the process group, the request that is replicated attempts to stop the DebugFlow processor. Node 2 will now fail to stop the processor because the processor is disabled. As a result, Node 2 will now be kicked out of the cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)