Satish, we encounter this frequently and consider it a major bug. Your solution makes sense to me.
Ryanne

On Tue, Jun 22, 2021, 7:29 PM Satish Duggana <satish.dugg...@gmail.com> wrote:

> Hi,
> Bumping up the discussion thread on KIP-501 about avoiding out-of-sync or
> offline partitions when follower fetch requests are not processed in time
> by the leader replica. This issue occurred several times in multiple
> production environments (at Uber, Yelp, Twitter, etc).
>
> KIP-501 is located here
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-501+Avoid+out-of-sync+or+offline+partitions+when+follower+fetch+requests+are+not+processed+in+time>.
> You may want to look at the earlier mail discussion thread here
> <https://mail-archives.apache.org/mod_mbox/kafka-dev/202002.mbox/%3Cpony-9f4e96e457398374499ab892281453dcaa7dc679-11722f366b06d9f46bcb5905ff94fd6ab167598e%40dev.kafka.apache.org%3E>,
> and here
> <https://mail-archives.apache.org/mod_mbox/kafka-dev/202002.mbox/%3CCAM-aUZnJ4z%2B_ztjF6sXSL61M1me0ogWZ1BV6%2BoV45rJMG8EoZA%40mail.gmail.com%3E>.
>
> Please take a look, I would like to hear your feedback and suggestions.
>
> Thanks,
> Satish.