Hi José,

Thanks for the KIP.

I have one question regarding how fetching from followers will
work while the leader is recovering. My understanding is that
the leader will reject any produce and fetch requests with a
NOT_LEADER_OR_FOLLOWER error, while the followers
will fence any fetch requests based on the incremented leader
epoch. That seems OK for recent consumers from a correctness
perspective, but it might be a little weird for older consumers that
do not set the leader epoch in the fetch request (prior to v9). If I
understand it correctly, they would be able to fetch from the
followers while the leader recovers. It might be good to clarify
this case in the KIP. What do you think?
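
To make the concern concrete, here is a rough sketch of the follower-side
epoch check as I understand it. This is a simplified model, not the actual
broker code: the class and method names are made up for illustration, and
it assumes a request without a leader epoch simply skips the check.

```java
import java.util.OptionalInt;

// Illustrative model only (hypothetical names); not Kafka's broker code.
final class FollowerFetchFencing {
    enum Result { NONE, FENCED_LEADER_EPOCH, UNKNOWN_LEADER_EPOCH }

    // requestEpoch is empty for fetch requests that do not carry a leader
    // epoch (assumed here to be the case for consumers prior to v9).
    static Result validate(OptionalInt requestEpoch, int localEpoch) {
        if (!requestEpoch.isPresent()) {
            // Assumption: nothing to compare against, so no fencing happens.
            return Result.NONE;
        }
        int epoch = requestEpoch.getAsInt();
        if (epoch < localEpoch) {
            return Result.FENCED_LEADER_EPOCH;   // client is behind the bump
        }
        if (epoch > localEpoch) {
            return Result.UNKNOWN_LEADER_EPOCH;  // follower is behind
        }
        return Result.NONE;
    }

    public static void main(String[] args) {
        int epochAfterBump = 6; // follower has seen the incremented epoch
        System.out.println(validate(OptionalInt.of(5), epochAfterBump));
        System.out.println(validate(OptionalInt.empty(), epochAfterBump));
    }
}
```

In this model a recent consumer still holding the old epoch is fenced, while
a request with no epoch passes the check, which is the case I am asking about.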


Best,
David

On Fri, Jan 28, 2022 at 7:18 PM José Armando García Sancio
<jsan...@confluent.io.invalid> wrote:
>
> Hi all,
>
> Jason and I discussed this offline. At a high level, I have made the
> following changes to the KIP.
>
> 1. IBP will be used to enable this feature and to determine which
> version of LeaderAndIsr and AlterPartition will be used.
> 2. The LeaderRecoveryState field for LeaderAndIsr and AlterPartition
> is not marked as ignorable.
>
> If the controller sees an AlterPartition request with version 0, it
> will assume that the leader has recovered.
>
> If the leader receives RECOVERING as the LeaderRecoveryState, it will
> attempt to recover the partition irrespective of the IBP. Once it has
> recovered, it will send the version of AlterPartition that matches
> the IBP.
>
> KIP Diff: 
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=173082256&selectedPageVersions=19&selectedPageVersions=18
>
> Thanks,
> -José
