Pretty happy to see a KIP tackling this problem! One comment below.

The "Extending LeaderEpoch to include Returning Leaders" states:
"To protect against this eventuality the controller will maintain a cached
mapping of [broker -> Zookeeper CZXID] (CZXID is a unique and monotonic
64-bit number) for the broker’s registration in Zookeeper
(/brokers/ids/[brokerId]). If the controller receives a Broker Registration
where the CZXID has changed it will increment the Leader Epoch and
propagate that value to the broker via the Leader and ISR Request (in the
normal way), then update the cached CZXID for that broker."

In general I think kafka underutilizes zookeeper's various flavors of zxids
but this time it's not clear to me what the motivation is for maintaining
the broker to czxid mapping. It seems that the following check is
redundant: "If the controller receives a Broker Registration where the
CZXID has changed". By definition, the czxid of the /brokers/ids/[brokerId]
znode will always change upon successful broker registration (
https://zookeeper.apache.org/doc/r3.4.8/zookeeperProgrammers.html#sc_zkStatStructure).
Why maintain the mapping at all? Why not just always update leader epochs
and propagate every time the controller receives the broker registration zk
event?

On Sun, Dec 11, 2016 at 2:30 PM, Neha Narkhede <n...@confluent.io> wrote:

> Good to see this KIP being proposed. Back when I added the epoch to the
> replication protocol, we discussed adding it to the log due to the failure
> scenarios listed in the KIP but I failed to convince people that it was
> worth the effort needed to upgrade the cluster (especially after we asked
> people to go through a painful backwards incompatible upgrade for 0.8 :-))
> The lack of including the leader epoch/generation in the log has also been
> one of the biggest critiques of Kafka's replication protocol by the
> distributed systems community.
>
> I'm in favor of this work though I think we shouldn't end up with 2 notions
> of representing a leader's generation. When we added the epoch, we wanted
> to add it to the log but we didn't. Now that we are adding the generation
> id to the log, I think we should revisit calling it the epoch at all. Have
> you thought about a way to evolve the epoch to the generation id throughout
> and what it will take?
>
> On Sun, Dec 11, 2016 at 4:31 AM Ben Stopford <b...@confluent.io> wrote:
>
> > Hi All
> >
> > Please find the below KIP which describes a proposed solution to a couple
> > of issues that have been observed with the replication protocol.
> >
> > In short, the proposal replaces the use of the High Watermark, for
> > follower log trunctation, with an alternate Generation Marker. This
> > uniquely defines which leader messages were acknowledged by.
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 101+-+Alter+Replication+Protocol+to+use+Leader+
> Generation+rather+than+High+Watermark+for+Truncation
> > <
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 101+-+Alter+Replication+Protocol+to+use+Leader+
> Generation+rather+than+High+Watermark+for+Truncation
> > >
> >
> > All comments and suggestions greatly appreciated.
> >
> > Ben Stopford
> > Confluent, http://www.confluent.io <http://www.confluent.io/>
> >
> > --
> Thanks,
> Neha
>

Reply via email to