Rounak Kakkar created ZOOKEEPER-4896:
----------------------------------------
Summary: Inconsistency for a ZK Path in the cluster
Key: ZOOKEEPER-4896
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4896
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.8.4
Environment: ZooKeeper is deployed as a three-node cluster. The configuration
is attached.
Reporter: Rounak Kakkar
Attachments: zk_config_stats, zk_path_issue
We are running ZooKeeper in cluster mode. For one path, the data on one
follower is inconsistent with the leader and the other follower. On all three
nodes, the cversion of the path increases as its children are updated.
However, the affected follower lists four additional children under that path,
and its cversion consistently stays four behind the leader and the other
follower.
If we try to delete one of the extra entries while connected to the affected
follower, it fails with the error: Node does not exist.
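The comparison described above can be sketched as a small script. This is only an illustration, assuming the `stat` and `ls` output from each server has already been collected by hand; the names `Snapshot` and `diff_against_leader` are hypothetical and not part of ZooKeeper. The values are taken from the first `stat`/`ls` pair in the sessions in this report, with the child lists abbreviated to two shared entries plus the four that differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One server's view of a znode (hypothetical helper type)."""
    cversion: int
    children: frozenset


def diff_against_leader(leader: Snapshot, other: Snapshot) -> dict:
    """Report how another server's view of the znode differs from the leader's."""
    return {
        "extra_children": sorted(other.children - leader.children),
        "missing_children": sorted(leader.children - other.children),
        "cversion_lag": leader.cversion - other.cversion,
    }


# Two of the children that all three servers agree on (abbreviated list).
shared = {"20250206_0_685_142", "20250206_0_690_143"}

leader = Snapshot(cversion=2332172, children=frozenset(shared))
follower_2 = Snapshot(
    cversion=2332168,
    children=frozenset(shared | {
        "20250130_0_4587_333",
        "20250130_0_4592_334",
        "20250130_4443_4565_11",
        "20250130_4560_4560_0",
    }),
)

report = diff_against_leader(leader, follower_2)
print(report)
```

Using the second `stat`/`ls` pair instead (cversion 2332178 on the leader, 2332174 on Follower-2) gives the same four extra children and the same lag of four.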
*Leader:*
vmanage@60f9084accc9:/var/lib/zookeeper$ /var/lib/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
vmanage@60f9084accc9:/var/lib/zookeeper$
[zk: 127.0.0.1:2181(CONNECTED) 1] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c98fb2
cversion = 2332172
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 22
[zk: 127.0.0.1:2181(CONNECTED) 2] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145,
20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0,
20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0,
20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0,
20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0,
20250206_702_702_0, 20250206_703_703_0]
[zk: 127.0.0.1:2181(CONNECTED) 3] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c9b371
cversion = 2332178
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 26
[zk: 127.0.0.1:2181(CONNECTED) 4] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145,
20250206_0_705_146, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0,
20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0,
20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0,
20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0,
20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0,
20250206_705_705_0, 20250206_706_706_0]
*Follower-1 (No issue seen):*
vmanage@aacf5ef6555b:/var/lib/zookeeper$ /var/lib/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
vmanage@aacf5ef6555b:/var/lib/zookeeper$
[zk: 127.0.0.1:2181(CONNECTED) 1] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c98fb2
cversion = 2332172
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 22
[zk: 127.0.0.1:2181(CONNECTED) 2] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145,
20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0,
20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0,
20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0,
20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0,
20250206_702_702_0, 20250206_703_703_0]
[zk: 127.0.0.1:2181(CONNECTED) 3] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c9b371
cversion = 2332178
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 26
[zk: 127.0.0.1:2181(CONNECTED) 4] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144, 20250206_0_700_145,
20250206_0_705_146, 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0,
20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0,
20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0,
20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0,
20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0,
20250206_705_705_0, 20250206_706_706_0]
*Follower-2 (Issue seen):*
vmanage@fc890e903696:/var/lib/zookeeper$ /var/lib/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
vmanage@fc890e903696:/var/lib/zookeeper$
[zk: 127.0.0.1:2181(CONNECTED) 1] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c98fb2
cversion = 2332168
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 26
[zk: 127.0.0.1:2181(CONNECTED) 2]
[zk: 127.0.0.1:2181(CONNECTED) 2] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11,
20250130_4560_4560_0, 20250206_0_685_142, 20250206_0_690_143,
20250206_0_695_144, 20250206_0_700_145, 20250206_686_686_0, 20250206_687_687_0,
20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0,
20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0,
20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0,
20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0]
[zk: 127.0.0.1:2181(CONNECTED) 3] stat -w /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
cZxid = 0x13000611b0
ctime = Thu Jun 06 13:04:52 UTC 2024
mZxid = 0x13000611b0
mtime = Thu Jun 06 13:04:52 UTC 2024
pZxid = 0x9109c9b371
cversion = 2332174
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 30
[zk: 127.0.0.1:2181(CONNECTED) 4]
[zk: 127.0.0.1:2181(CONNECTED) 4] ls /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
[20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11,
20250130_4560_4560_0, 20250206_0_685_142, 20250206_0_690_143,
20250206_0_695_144, 20250206_0_700_145, 20250206_0_705_146, 20250206_686_686_0,
20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0,
20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0,
20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0,
20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0,
20250206_703_703_0, 20250206_704_704_0, 20250206_705_705_0, 20250206_706_706_0]
[zk: 127.0.0.1:2181(CONNECTED) 5] delete /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts/20250130_0_4587_333
Node does not exist: /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts/20250130_0_4587_333
--
This message was sent by Atlassian Jira
(v8.20.10#820010)