Gangadhar created ZOOKEEPER-3906:
------------------------------------

             Summary: Data Inconsistency Between Zookeeper Leader and zookeeper Followers
                 Key: ZOOKEEPER-3906
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3906
             Project: ZooKeeper
          Issue Type: Bug
          Components: quorum
    Affects Versions: 3.5.7
            Reporter: Gangadhar

Issue: Data Inconsistency Between ZooKeeper Leader and ZooKeeper Followers.

When we did a topic lookup for one of the topics, we got a "broker not part of the cluster" error. We verified the following as part of troubleshooting.

Steps followed as part of troubleshooting:

We have a 5-node ZooKeeper cluster.

*Step 1:* Verified whether all ZooKeeper servers are following the leader. According to the output below, taken from the leader, all 4 followers are following it and are in sync.

zk_version 3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 02/11/2020 11:30 GMT
zk_avg_latency 0
zk_max_latency 823
zk_min_latency 0
zk_packets_received 30214264
zk_packets_sent 32424272
zk_num_alive_connections 7
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 75190
zk_watch_count 21394
zk_ephemerals_count 793
zk_approximate_data_size 24706628
zk_open_file_descriptor_count 281
zk_max_file_descriptor_count 4096
zk_followers 4
zk_synced_followers 4
zk_pending_syncs 0
zk_last_proposal_size 166
zk_max_proposal_size 121947
zk_min_proposal_size 32

*Step 2:* Checked the namespace bundle znode on every ZooKeeper server using the command below. All servers returned the data except the leader.

./pulsar zookeeper-shell get /namespace/$tenant/$Namespace/$Bundle

*Step 3:* Tried to delete the namespace bundle znode so that another broker could take ownership of the topic.

./pulsar zookeeper-shell deleteall /namespace/$tenant/$Namespace/$Bundle

*Error:*

14:04:54.769 [main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 10485760 Bytes
14:04:54.775 [main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
14:04:54.824 [main-SendThread(11.111.226.146:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 11-111-226-146.ebiz.verizon.com/11.111.226.146:2181. Will not attempt to authenticate using SASL (unknown error)
14:04:54.831 [main-SendThread(11.111.226.146:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session, client: /11.111.225.75:38804, server: 11-111-226-146.ebiz.verizon.com/11.111.20.146:2181
14:04:54.835 [main-SendThread(11.111.226.146:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 11-111-226-146.ebiz.verizon.com/11.111.226.146:2181, sessionid = 0x500001bbbeb0651, negotiated timeout = 20000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

*Node does not exist: /namespace/$tenant/$Namespace/$Bundle*

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
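
The Step 1 check could be repeated on every ensemble member with a short script, sketched below. It assumes the statistics shown in Step 1 are the reply to ZooKeeper's `mntr` four-letter command (the report does not name the command) and that `mntr` is allowed by `4lw.commands.whitelist` on each server. The function names are hypothetical. Matching znode counts would not prove that the data is identical, so the per-server `get` from Step 2 is still needed.

```python
import socket


def fetch_mntr(host, port=2181, timeout=5.0):
    """Send the four-letter command `mntr` and return the raw reply text.

    Assumes `mntr` is whitelisted on the server (4lw.commands.whitelist).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Parse `key<whitespace>value` lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            stats[parts[0]] = parts[1]
    return stats


def all_followers_synced(leader_stats):
    """True when the leader reports every follower as synced, none pending."""
    if leader_stats.get("zk_server_state") != "leader":
        return False
    followers = float(leader_stats["zk_followers"])
    synced = float(leader_stats["zk_synced_followers"])
    pending = float(leader_stats["zk_pending_syncs"])
    return followers == synced and pending == 0


def znode_count_mismatch(stats_by_host):
    """True when servers report different zk_znode_count values.

    A mismatch is only a hint: counts can differ briefly while a
    follower catches up, and equal counts do not prove equal data.
    """
    counts = {int(float(s["zk_znode_count"])) for s in stats_by_host.values()}
    return len(counts) > 1
```

Calling `parse_mntr(fetch_mntr(host))` for each of the five servers and passing the results to `znode_count_mismatch` gives a quick first signal before comparing individual znodes.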