[ https://issues.apache.org/jira/browse/NIFI-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941319#comment-16941319 ]
ASF subversion and git services commented on NIFI-6589:
-------------------------------------------------------

Commit fbd6200ab3e4410fd9cf05f31348ec56a89d5af7 in nifi's branch refs/heads/master from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=fbd6200 ]

NIFI-6589: This closes #3670. Cache results from zookeeper when determining the leader

NIFI-6589: Updated CuratorLeaderElectionManager to cache results for no more than 5 seconds per review feedback

Signed-off-by: Joe Witt <joew...@apache.org>

> Leader Election should cache results obtained from ZooKeeper
> ------------------------------------------------------------
>
>                 Key: NIFI-6589
>                 URL: https://issues.apache.org/jira/browse/NIFI-6589
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Minor
>             Fix For: 1.10.0
>
>
> In order to determine which node in a cluster is the Cluster Coordinator, a
> node must make a request to ZooKeeper. That means that if we have N nodes in
> a cluster, then for each request we must ask ZooKeeper at least (N+1) times
> (and no more than N+2) who is the Cluster Coordinator. This is done because
> when the request comes in, the node must determine whether or not it is the
> Cluster Coordinator. If so, it must replicate the request to each node. If
> not, it must forward the request to the Cluster Coordinator, which will then
> do so. When the request is replicated, each receiving node will again check
> whether it is the Cluster Coordinator. If we instead cache the result of
> querying ZooKeeper for a short period of time, say 1 minute, we can
> dramatically decrease the number of times that we hit ZooKeeper. If the
> Coordinator / Primary Node changes in the meantime, the node should still be
> notified of the change asynchronously.
>
> The polling is done currently because we've seen situations where the
> asynchronous notification did not happen. But if we update the code so that
> we cache the results, this means that we will also update the code for
> caching results of which node is Primary Node. This is a benefit as well,
> because currently we don't poll for this and as a result, if we do happen to
> miss the notification, we could theoretically have 2 nodes running processors
> that should only run on Primary Node.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
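
For readers who want a concrete picture of the caching idea described in the issue, here is a minimal sketch in Java. It is not the code from the commit: the class name `CachedLeaderLookup`, the `zooKeeperLookup` supplier standing in for the real ZooKeeper query, and the injectable clock are all assumptions made for illustration. The sketch shows the two mechanisms the description relies on: a bounded cache age, so that polling still happens if a notification is missed, and an asynchronous notification hook that updates the cache immediately.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: caches the answer to "who is the leader?" for a
 * bounded time so that repeated checks do not each hit ZooKeeper.
 */
class CachedLeaderLookup {
    // Stand-in for the real ZooKeeper query (an assumption, not NiFi's API).
    private final Supplier<String> zooKeeperLookup;
    // Injectable nanosecond clock so the expiry logic can be exercised in tests.
    private final LongSupplier nanoClock;
    private final long maxAgeNanos;

    private String cachedLeader;
    private long fetchedAtNanos;
    private boolean populated;

    CachedLeaderLookup(Supplier<String> zooKeeperLookup, LongSupplier nanoClock,
                       long maxAge, TimeUnit unit) {
        this.zooKeeperLookup = zooKeeperLookup;
        this.nanoClock = nanoClock;
        this.maxAgeNanos = unit.toNanos(maxAge);
    }

    /** Returns the cached leader, re-querying only when the entry is missing or too old. */
    synchronized String getLeader() {
        long now = nanoClock.getAsLong();
        if (!populated || now - fetchedAtNanos >= maxAgeNanos) {
            cachedLeader = zooKeeperLookup.get();
            fetchedAtNanos = now;
            populated = true;
        }
        return cachedLeader;
    }

    /** Called when an asynchronous change notification arrives; refreshes the cache directly. */
    synchronized void onLeaderChanged(String newLeader) {
        cachedLeader = newLeader;
        fetchedAtNanos = nanoClock.getAsLong();
        populated = true;
    }
}
```

In production use the clock would presumably be `System::nanoTime`, with a maximum age of 5 seconds to match the second commit message. Because entries expire, a missed notification leaves a node with a stale answer for at most the configured age, which is the trade-off between the "1 minute" suggested in the description and the shorter limit adopted after review.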