Hello,

On Thu, 6 Feb 2025, 13:29 Rounak Kakkar (rkakkar) <[email protected]> wrote:
> Hello Team,
>
> We are using Zookeeper in cluster mode. For one of the paths, we notice a
> data inconsistency between a follower node and the other nodes in the
> system. We observe that the version for the path increases alongside
> content updates. However, there are four additional entries in that path in
> the follower where the issue is present. The version consistently lags four
> updates behind both the leader and the other followers.
>
> If we try to delete one of the extra entries from that follower node, it
> fails with the error: Node does not exist.
>

It is not possible to delete data from a single follower: write requests
sent to a follower are forwarded to the leader. How did you do that?

>
> What could cause this issue? Could you please help check this?
>

You could try to nuke the broken node and let it download a fresh
snapshot from its peers.

Don't try this for the first time in production! Practice the procedure
first, and do it in production only when you are confident about the steps.

(Please do not cross-post to the dev and user mailing lists.)

Enrico

>
> Logs:
>
> Leader:
>
> vmanage@60f9084accc9:/var/lib/zookeeper$
> /var/lib/zookeeper/bin/zkServer.sh status
> ZooKeeper JMX enabled by default
> Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
> Client port found: 2181. Client address: localhost. Client SSL: false.
> Mode: leader
> vmanage@60f9084accc9:/var/lib/zookeeper$
>
> [zk: 127.0.0.1:2181(CONNECTED) 1] stat -w
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> cZxid = 0x13000611b0
> ctime = Thu Jun 06 13:04:52 UTC 2024
> mZxid = 0x13000611b0
> mtime = Thu Jun 06 13:04:52 UTC 2024
> pZxid = 0x9109c98fb2
> cversion = 2332172
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 22
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 2] ls
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> [20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144,
> 20250206_0_700_145, 20250206_686_686_0, 20250206_687_687_0,
> 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0,
> 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0,
> 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0,
> 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0,
> 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0,
> 20250206_703_703_0]
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 3] stat -w
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> cZxid = 0x13000611b0
> ctime = Thu Jun 06 13:04:52 UTC 2024
> mZxid = 0x13000611b0
> mtime = Thu Jun 06 13:04:52 UTC 2024
> pZxid = 0x9109c9b371
> cversion = 2332178
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 26
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 4] ls
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> [20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144,
> 20250206_0_700_145, 20250206_0_705_146, 20250206_686_686_0,
> 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0,
> 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0,
> 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0,
> 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0,
> 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0,
> 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0,
> 20250206_705_705_0, 20250206_706_706_0]
>
> Follower-1 (No issue seen):
>
> vmanage@aacf5ef6555b:/var/lib/zookeeper$
> /var/lib/zookeeper/bin/zkServer.sh status
> ZooKeeper JMX enabled by default
> Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
> Client port found: 2181. Client address: localhost. Client SSL: false.
> Mode: follower
> vmanage@aacf5ef6555b:/var/lib/zookeeper$
>
> [zk: 127.0.0.1:2181(CONNECTED) 1] stat -w
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> cZxid = 0x13000611b0
> ctime = Thu Jun 06 13:04:52 UTC 2024
> mZxid = 0x13000611b0
> mtime = Thu Jun 06 13:04:52 UTC 2024
> pZxid = 0x9109c98fb2
> cversion = 2332172
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 22
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 2] ls
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> [20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144,
> 20250206_0_700_145, 20250206_686_686_0, 20250206_687_687_0,
> 20250206_688_688_0, 20250206_689_689_0, 20250206_690_690_0,
> 20250206_691_691_0, 20250206_692_692_0, 20250206_693_693_0,
> 20250206_694_694_0, 20250206_695_695_0, 20250206_696_696_0,
> 20250206_697_697_0, 20250206_698_698_0, 20250206_699_699_0,
> 20250206_700_700_0, 20250206_701_701_0, 20250206_702_702_0,
> 20250206_703_703_0]
>
> [zk: 127.0.0.1:2181(CONNECTED) 3] stat -w
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> cZxid = 0x13000611b0
> ctime = Thu Jun 06 13:04:52 UTC 2024
> mZxid = 0x13000611b0
> mtime = Thu Jun 06 13:04:52 UTC 2024
> pZxid = 0x9109c9b371
> cversion = 2332178
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 26
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 4] ls
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> [20250206_0_685_142, 20250206_0_690_143, 20250206_0_695_144,
> 20250206_0_700_145, 20250206_0_705_146, 20250206_686_686_0,
> 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0,
> 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0,
> 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0,
> 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0,
> 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0,
> 20250206_702_702_0, 20250206_703_703_0, 20250206_704_704_0,
> 20250206_705_705_0, 20250206_706_706_0]
>
>
> Follower-2 (Issue seen):
>
> vmanage@fc890e903696:/var/lib/zookeeper$
> /var/lib/zookeeper/bin/zkServer.sh status
> ZooKeeper JMX enabled by default
> Using config: /var/lib/zookeeper/bin/../conf/zoo.cfg
> Client port found: 2181. Client address: localhost. Client SSL: false.
> Mode: follower
> vmanage@fc890e903696:/var/lib/zookeeper$
>
> [zk: 127.0.0.1:2181(CONNECTED) 1] stat -w
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> cZxid = 0x13000611b0
> ctime = Thu Jun 06 13:04:52 UTC 2024
> mZxid = 0x13000611b0
> mtime = Thu Jun 06 13:04:52 UTC 2024
> pZxid = 0x9109c98fb2
> cversion = 2332168
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 26
> [zk: 127.0.0.1:2181(CONNECTED) 2]
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 2] ls
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> [20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11,
> 20250130_4560_4560_0, 20250206_0_685_142, 20250206_0_690_143,
> 20250206_0_695_144, 20250206_0_700_145, 20250206_686_686_0,
> 20250206_687_687_0, 20250206_688_688_0, 20250206_689_689_0,
> 20250206_690_690_0, 20250206_691_691_0, 20250206_692_692_0,
> 20250206_693_693_0, 20250206_694_694_0, 20250206_695_695_0,
> 20250206_696_696_0, 20250206_697_697_0, 20250206_698_698_0,
> 20250206_699_699_0, 20250206_700_700_0, 20250206_701_701_0,
> 20250206_702_702_0, 20250206_703_703_0]
>
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 3] stat -w
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> cZxid = 0x13000611b0
> ctime = Thu Jun 06 13:04:52 UTC 2024
> mZxid = 0x13000611b0
> mtime = Thu Jun 06 13:04:52 UTC 2024
> pZxid = 0x9109c9b371
> cversion = 2332174
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 30
> [zk: 127.0.0.1:2181(CONNECTED) 4]
>
>
> [zk: 127.0.0.1:2181(CONNECTED) 4] ls
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts
> [20250130_0_4587_333, 20250130_0_4592_334, 20250130_4443_4565_11,
> 20250130_4560_4560_0, 20250206_0_685_142, 20250206_0_690_143,
> 20250206_0_695_144, 20250206_0_700_145, 20250206_0_705_146,
> 20250206_686_686_0, 20250206_687_687_0, 20250206_688_688_0,
> 20250206_689_689_0, 20250206_690_690_0, 20250206_691_691_0,
> 20250206_692_692_0, 20250206_693_693_0, 20250206_694_694_0,
> 20250206_695_695_0, 20250206_696_696_0, 20250206_697_697_0,
> 20250206_698_698_0, 20250206_699_699_0, 20250206_700_700_0,
> 20250206_701_701_0, 20250206_702_702_0, 20250206_703_703_0,
> 20250206_704_704_0, 20250206_705_705_0, 20250206_706_706_0]
>
> [zk: 127.0.0.1:2181(CONNECTED) 5] delete
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts/20250130_0_4587_333
> Node does not exist:
> /clickhouse/tables/shard_2/aggregated_apps_dpi_app_summary_7f1396c5_a4d8_48c9_a3f3_90271305c114/replicas/replica_2/parts/20250130_0_4587_333
>
>
> Thanks,
> Rounak
>
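
For anyone comparing the listings by hand: a small Python sketch of the comparison, assuming the child lists and cversion values are copied from the first `ls` and `stat` outputs quoted above. The helper name `extra_children` is only illustrative and is not part of any ZooKeeper tooling. It reports which entries exist only on Follower-2 and how far its cversion is behind the leader.

```python
def extra_children(suspect, reference):
    """Return entries present on the suspect server but missing on the reference."""
    return sorted(set(suspect) - set(reference))


# Children reported by the leader (and by Follower-1) in the first `ls`: 22 entries.
leader = (
    ["20250206_0_685_142", "20250206_0_690_143",
     "20250206_0_695_144", "20250206_0_700_145"]
    + [f"20250206_{n}_{n}_0" for n in range(686, 704)]
)

# Follower-2 reports the same 22 entries plus four older ones: 26 entries.
follower2 = [
    "20250130_0_4587_333", "20250130_0_4592_334",
    "20250130_4443_4565_11", "20250130_4560_4560_0",
] + leader

extras = extra_children(follower2, leader)

# cversion from the first `stat -w` on the leader and on Follower-2.
cversion_lag = 2332172 - 2332168

print("only on Follower-2:", extras)
print("cversion lag:", cversion_lag)
```

The same comparison on the second pair of outputs (cversion 2332178 on the leader, 2332174 on Follower-2) gives the same lag, which matches the description of a constant offset rather than a follower that is merely slow to catch up.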
