[ https://issues.apache.org/jira/browse/ZOOKEEPER-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108113#comment-17108113 ]
Mate Szalay-Beko commented on ZOOKEEPER-3829:
---------------------------------------------

Did you actually see the ["Shutting down"|https://github.com/apache/zookeeper/blob/e87bad6774e7269ef21a156aff9dad089ef54794/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/CommitProcessor.java#L632] printout in the logs before the {{start}} method was called on the same CommitProcessor? I see this one in your logs:
{code:java}
2020-05-14 18:04:12,023 [myid:4] - INFO [QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):FinalRequestProcessor@514] - shutdown of request processor complete
{code}
But this is about shutting down the {{FinalRequestProcessor}}, not the {{CommitProcessor}}.

> Zookeeper refuses request after node expansion
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-3829
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3829
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.6
>            Reporter: benwang li
>            Priority: Major
>         Attachments: d.log
>
>
> It's easy to reproduce this bug.
> {code:java}
> // code placeholder
>
> Step 1. Deploy 3 nodes A,B,C with configuration A,B,C.
> Step 2. Deploy node `D` with configuration `A,B,C,D`; the cluster state is ok
> now.
> Step 3. Restart nodes A,B,C with configuration A,B,C,D. The leader will then
> be D and the cluster hangs: it still accepts the `mntr` command, but other
> commands such as `ls /` are blocked.
> Step 4. Restart node D; the cluster state is back to normal now.
>
> {code}
>
> We have looked into the code of version 3.5.6, and we think the issue may be
> in `workerPool`.
> When the `CommitProcessor` shuts down, it shuts down `workerPool` as well, but
> the `workerPool` object still exists. It will never work again, yet the
> cluster still thinks it's ok.
>
> I think the bug may still exist in the master branch.
> We have tested a fix on our machines by resetting `workerPool` to null. If
> that's ok, please assign this issue to me, and then I'll create a PR.
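
The `workerPool` behaviour described in the report can be illustrated with a small standalone sketch. This is not the ZooKeeper code: {{SketchProcessor}}, {{resetPoolOnShutdown}} and {{acceptsWorkAfterRestart}} are hypothetical names, and a plain {{java.util.concurrent.ExecutorService}} stands in for the real worker pool. It assumes, as the report suggests, that {{start}} only creates the pool when the field is null, so a pool that was shut down but left in place is reused and no longer accepts work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class WorkerPoolRestartSketch {

    /** Hypothetical stand-in for a processor that owns a worker pool. */
    static class SketchProcessor {
        private final boolean resetPoolOnShutdown;
        private ExecutorService workerPool;

        SketchProcessor(boolean resetPoolOnShutdown) {
            this.resetPoolOnShutdown = resetPoolOnShutdown;
        }

        void start() {
            // Assumption: the pool is only created when the field is null.
            if (workerPool == null) {
                workerPool = Executors.newFixedThreadPool(2);
            }
        }

        /** Returns true if the pool accepted the task, false if it was rejected. */
        boolean submit(Runnable task) {
            try {
                workerPool.execute(task);
                return true;
            } catch (RejectedExecutionException e) {
                // A pool that has been shut down rejects new tasks.
                return false;
            }
        }

        void shutdown() {
            if (workerPool != null) {
                workerPool.shutdown();
                if (resetPoolOnShutdown) {
                    // The workaround proposed in the report: drop the dead pool
                    // so that the next start() creates a fresh one.
                    workerPool = null;
                }
            }
        }
    }

    /** Runs start, shutdown, start and reports whether work is accepted afterwards. */
    static boolean acceptsWorkAfterRestart(boolean resetPoolOnShutdown) {
        SketchProcessor processor = new SketchProcessor(resetPoolOnShutdown);
        processor.start();
        processor.shutdown();
        processor.start();
        boolean accepted = processor.submit(() -> { });
        processor.shutdown(); // release worker threads so the JVM can exit
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + acceptsWorkAfterRestart(false));
        System.out.println("with reset:    " + acceptsWorkAfterRestart(true));
    }
}
```

In this sketch the restarted processor rejects work unless the pool reference is cleared on shutdown. Whether the same sequence (shutdown followed by {{start}} on the same {{CommitProcessor}}) actually happens in the reported scenario is exactly what the log question above is trying to establish.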

--
This message was sent by Atlassian Jira
(v8.3.4#803005)