Replica fetcher not fetching post rolling reboots

Maruthi Vemuri (mavemuri) Mon, 04 Oct 2021 18:42:14 -0700

Hello,

We are seeing an issue after rolling restarts where the replicas of a few
partitions lag and never catch up. The log files for these partitions appear to
be the same size on all brokers, including the ones where the replicas are
lagging. The FailedPartitionsCount metric stays at 0, but the replicas remain
stuck in that state until we manually either reassign the partitions or
re-elect the leader. Some of the partitions in question don’t even receive any
data during the rolling reboots. These partitions have min.insync.replicas set
to 1, but even then, shouldn't the replicas eventually catch up to the leader?
As far as I could make out, ReplicaFetcherThread simply stopped fetching for
those partitions.


Has anyone seen a similar issue?

Thanks,
Maruthi
