Rounak Kakkar created ZOOKEEPER-4896:
----------------------------------------

Summary: Inconsistency for a ZK Path in the cluster
Key: ZOOKEEPER-4896
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4896
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.8.4
Environment: ZooKeeper is deployed in a cluster environment. There are three nodes in the cluster. The config is attached.
Reporter: Rounak Kakkar
Attachments: zk_config_stats, zk_path_issue

We are running ZooKeeper in cluster mode. For one path, the data on one follower is inconsistent with the leader and the other follower. On every node, the child version (cversion) of the path increases as its children are updated. On the affected follower, however, the path has four additional child entries, and its cversion consistently lags four behind the leader and the other follower. If we try to delete one of the extra entries from that follower node, it fails with the error: Node does not exist.

*Leader:*

vmanage@60f9084accc9:/var/lib/zookeeper$ /var/lib/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
vmanage@60f9084accc9:/var/lib/zookeeper$

[zk: 127.0.0.1:2181(CONNECTED) 1] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c98fb2
cversion = 2332172
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 22

[zk: 127.0.0.1:2181(CONNECTED) 2] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0]

[zk: 127.0.0.1:2181(CONNECTED) 3] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c9b371
cversion = 2332178
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 26

[zk: 127.0.0.1:2181(CONNECTED) 4] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, 20250206_0_705_146, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0, 20250206_705_705_0, 20250206_706_706_0]

*Follower-1 (No issue seen):*

vmanage@aacf5ef6555b:/var/lib/zookeeper$ /var/lib/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
vmanage@aacf5ef6555b:/var/lib/zookeeper$

[zk: 127.0.0.1:2181(CONNECTED) 1] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c98fb2
cversion = 2332172
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 22

[zk: 127.0.0.1:2181(CONNECTED) 2] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0]

[zk: 127.0.0.1:2181(CONNECTED) 3] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c9b371
cversion = 2332178
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 26

[zk: 127.0.0.1:2181(CONNECTED) 4] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, 20250206_0_705_146, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0, 20250206_705_705_0, 20250206_706_706_0]

*Follower-2 (Issue seen):*

vmanage@fc890e903696:/var/lib/zookeeper$ /var/lib/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
vmanage@fc890e903696:/var/lib/zookeeper$

[zk: 127.0.0.1:2181(CONNECTED) 1] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c98fb2
cversion = 2332168
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 26
[zk: 127.0.0.1:2181(CONNECTED) 2]

[zk: 127.0.0.1:2181(CONNECTED) 2] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11, 20250130_4560_4560_0, 20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0]

[zk: 127.0.0.1:2181(CONNECTED) 3] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c9b371
cversion = 2332174
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 30
[zk: 127.0.0.1:2181(CONNECTED) 4]

[zk: 127.0.0.1:2181(CONNECTED) 4] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11, 20250130_4560_4560_0, 20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, 20250206_0_705_146, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0, 20250206_705_705_0, 20250206_706_706_0]

[zk: 127.0.0.1:2181(CONNECTED) 5] delete /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts/20250130_0_4587_333
Node does not exist: /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts/20250130_0_4587_333

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
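
For anyone triaging this, the discrepancy between the leader and Follower-2 can be confirmed mechanically rather than by eye. The following is only a minimal sketch, assuming the `ls` output of each node has been pasted in as a string; the helper names `parse_ls` and `compare_children` are hypothetical and not part of any ZooKeeper tooling. The strings are the first `ls` results from the leader and from Follower-2 as reported in this issue.

```python
def parse_ls(output):
    """Turn one zkCli `ls` result line such as "[a, b, c]" into a set of child names."""
    inner = output.strip().strip("[]")
    return {name.strip() for name in inner.split(",") if name.strip()}


def compare_children(reference_ls, suspect_ls):
    """Return (extra, missing): children only on the suspect node, and only on the reference node."""
    reference = parse_ls(reference_ls)
    suspect = parse_ls(suspect_ls)
    return sorted(suspect - reference), sorted(reference - suspect)


# First `ls` on the leader (numChildren = 22), copied from the report.
LEADER_LS = (
    "[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, "
    "20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, "
    "20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, "
    "20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, "
    "20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, "
    "20250206_702_702_0, 20250206_703_703_0]"
)

# First `ls` on Follower-2 (numChildren = 26), copied from the report.
FOLLOWER2_LS = (
    "[20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11, 20250130_4560_4560_0, "
    "20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145, "
    "20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, "
    "20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, "
    "20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, "
    "20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, "
    "20250206_702_702_0, 20250206_703_703_0]"
)

extra, missing = compare_children(LEADER_LS, FOLLOWER2_LS)

# cversion values from the first `stat -w` on the leader and on Follower-2.
cversion_lag = 2332172 - 2332168

print("only on Follower-2:", extra)
print("only on leader:", missing)
print("cversion lag:", cversion_lag)
```

With the data above, the only difference is the four `20250130_*` entries present on Follower-2, which matches both the numChildren gap (26 versus 22) and the cversion lag of four.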