[ https://issues.apache.org/jira/browse/KAFKA-8733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183304#comment-17183304 ]
Flavien Raynaud commented on KAFKA-8733:
----------------------------------------

Has there been any update regarding this issue/the associated KIP? I can see that the thread on the mailing list has been quiet for the past 6 months. This happened again recently when one broker encountered a disk failure, causing a bunch of offline partitions. Happy to help in any way we can 😄

> Offline partitions occur when leader's disk is slow in reads while responding to follower fetch requests.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8733
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8733
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.1.2, 2.4.0
>            Reporter: Satish Duggana
>            Assignee: Satish Duggana
>            Priority: Critical
>         Attachments: weighted-io-time-2.png, wio-time.png
>
>
> We have hit the offline partitions issue multiple times on some of the hosts in our
> clusters. After going through the broker logs and the hosts' disk stats, it
> looks like this issue occurs whenever read/write operations take longer than usual
> on that disk. In the particular case where the read time exceeds
> replica.lag.time.max.ms, follower replicas fall out of sync because their
> earlier fetch requests are stuck reading the local log and their fetch
> status has not yet been updated, as shown in the `ReplicaManager` code below.
> If reading the data from the log takes longer than
> replica.lag.time.max.ms, then all the replicas fall out of sync and the
> partition goes offline if min.insync.replicas > 1 and
> unclean.leader.election.enable is false.
>
> {code:java}
> def readFromLog(): Seq[(TopicPartition, LogReadResult)] = {
>   val result = readFromLocalLog( // this call took more than `replica.lag.time.max.ms`
>     replicaId = replicaId,
>     fetchOnlyFromLeader = fetchOnlyFromLeader,
>     readOnlyCommitted = fetchOnlyCommitted,
>     fetchMaxBytes = fetchMaxBytes,
>     hardMaxBytesLimit = hardMaxBytesLimit,
>     readPartitionInfo = fetchInfos,
>     quota = quota,
>     isolationLevel = isolationLevel)
>   // The fetch time gets updated here, but by then maybeShrinkIsr has
>   // already been called and the replica has been removed from the ISR.
>   if (isFromFollower) updateFollowerLogReadResults(replicaId, result)
>   else result
> }
> val logReadResults = readFromLog()
> {code}
> Attached are graphs of the disk weighted I/O time stats from when this issue occurred.
> I will raise [KIP-501|https://s.apache.org/jhbpn] describing options on how
> to handle this scenario.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
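
For anyone trying to reason about the timing described in the issue, here is a minimal sketch in Java. It is a toy model, not Kafka's code: the class `IsrLagSketch` and the methods `isOutOfSync` and `droppedDuringRead` are hypothetical names, the 10-second lag limit is only an illustrative value, and running the ISR check at half the lag limit is an assumption made for the example. The model assumes the follower was caught up when its fetch arrived and that its fetch state is only refreshed after the leader's local log read returns.

```java
public class IsrLagSketch {

    /** Simplified lag rule: out of sync once the follower has not caught up for longer than maxLagMs. */
    public static boolean isOutOfSync(long nowMs, long lastCaughtUpMs, long maxLagMs) {
        return nowMs - lastCaughtUpMs > maxLagMs;
    }

    /**
     * Returns true if a periodic ISR check would drop the follower while the
     * leader is still blocked in a local log read lasting readDurationMs.
     */
    public static boolean droppedDuringRead(long readDurationMs, long maxLagMs, long checkIntervalMs) {
        if (checkIntervalMs <= 0) {
            throw new IllegalArgumentException("checkIntervalMs must be positive");
        }
        // Assumption: the follower was caught up at time 0, when its fetch arrived.
        long lastCaughtUpMs = 0L;
        // Only checks that fire before the read completes see the stale fetch state.
        for (long nowMs = checkIntervalMs; nowMs < readDurationMs; nowMs += checkIntervalMs) {
            if (isOutOfSync(nowMs, lastCaughtUpMs, maxLagMs)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long maxLagMs = 10_000L;             // illustrative lag limit
        long checkIntervalMs = maxLagMs / 2; // assumed check period for this sketch

        System.out.println(droppedDuringRead(2_000L, maxLagMs, checkIntervalMs));  // prints false
        System.out.println(droppedDuringRead(16_000L, maxLagMs, checkIntervalMs)); // prints true
    }
}
```

In this model a 2-second read is harmless, while a 16-second read lets the check at 15 seconds see a follower that has not been marked as caught up for longer than the limit. If every follower of a partition is stuck behind the same slow disk, the same thing happens to all of them at once, which matches the all-replicas-out-of-sync outcome described in the issue.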