许胜斌 created KAFKA-15446:
---------------------------

             Summary: Upgrading from 2.0 to 2.8, with replica out of sync exceeding 12 hours
                 Key: KAFKA-15446
                 URL: https://issues.apache.org/jira/browse/KAFKA-15446
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 2.8.2
         Environment: CentOS 7, Java 8
            Reporter: 许胜斌
         Attachments: image-2023-09-08-16-37-12-364.png

!image-2023-09-08-16-37-12-364.png!

There are three brokers in the cluster. When node 0 is the leader of the partition, the replicas on nodes 1 and 2 cannot sync from it. This problem has lasted for more than ten hours, and the corresponding partition's directory under log.dir on nodes 1 and 2 has not been updated for a long time, which indicates that data replication has stopped. However, when node 1 or node 2 is the leader of the partition, the other nodes sync from it normally.

The error log is:

[2023-09-08 16:35:05,238] WARN [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Reset fetch offset for partition msg_for_dispatche-0 from 3636534258 to current leader's start offset 14558984559 (kafka.server.ReplicaFetcherThread)
[2023-09-08 16:35:05,238] INFO The cleaning for partition msg_for_dispatche-0 is aborted and paused (kafka.log.LogManager)
[2023-09-08 16:35:05,238] INFO [Log partition=msg_for_dispatche-0, dir=/usr/local/kafka/kafka-logs] Deleting segments as part of log truncation: LogSegment(baseOffset=3636534258, size=0, lastModifiedTime=1694162105000, largestRecordTimestamp=None) (kafka.log.Log)
[2023-09-08 16:35:05,241] INFO [Log partition=msg_for_dispatche-0, dir=/usr/local/kafka/kafka-logs] Loading producer state till offset 14558984559 with message format version 2 (kafka.log.Log)
[2023-09-08 16:35:05,241] INFO Cleaning for partition msg_for_dispatche-0 is resumed (kafka.log.LogManager)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
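
For anyone triaging similar logs, the ReplicaFetcher warning quoted above can be parsed to see how far the follower's fetch offset was from the leader's log start offset at the time of the reset. The following is only a minimal sketch: the regular expression is written against the single log line in this report, and the helper name `parse_reset` is hypothetical, not part of Kafka.

```python
import re

# Pattern written against the WARN line quoted in this report; other Kafka
# versions may word the message differently (assumption).
RESET_RE = re.compile(
    r"\[ReplicaFetcher replicaId=(?P<replica>\d+), leaderId=(?P<leader>\d+), "
    r"fetcherId=\d+\] Reset fetch offset for partition (?P<partition>\S+) "
    r"from (?P<old>\d+) to current leader's start offset (?P<new>\d+)"
)


def parse_reset(line):
    """Return details of a fetch-offset reset, or None if the line does not match."""
    m = RESET_RE.search(line)
    if m is None:
        return None
    old, new = int(m["old"]), int(m["new"])
    return {
        "replica": int(m["replica"]),
        "leader": int(m["leader"]),
        "partition": m["partition"],
        "old_offset": old,
        "leader_start_offset": new,
        # Number of offsets between the follower's position and the leader's start offset.
        "gap": new - old,
    }


# The WARN line from this report, copied verbatim.
sample = (
    "[2023-09-08 16:35:05,238] WARN [ReplicaFetcher replicaId=2, leaderId=0, "
    "fetcherId=0] Reset fetch offset for partition msg_for_dispatche-0 from "
    "3636534258 to current leader's start offset 14558984559 "
    "(kafka.server.ReplicaFetcherThread)"
)

info = parse_reset(sample)
print(info)
```

For the line in this report, the computed gap is 14558984559 - 3636534258 = 10922450301 offsets. Running the same parse over the full follower log would show whether the reset repeats for the same partition, which would be consistent with the follower never making progress.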