I am asking this because I want to propose a change to Kafka, but it looks like 
in certain scenarios it is very hard to avoid losing or duplicating messages. I 
wonder in which scenarios we can accept that, and where to draw the line.

________________________________
From: De Gao <d...@live.co.uk>
Sent: 21 November 2023 6:25
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: Re: How Kafka handle partition leader change?

Thanks Andrew. It sounds like the leadership change on the Kafka side is a 
'best effort' attempt to avoid message duplication or loss. Can we say that 
message loss is very likely during a leadership change unless the producer uses 
idempotency? Is it generally the case that there is no intent to provide a data 
integrity guarantee across metadata changes?
________________________________
From: Andrew Grant <agr...@confluent.io.INVALID>
Sent: 20 November 2023 12:26
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: Re: How Kafka handle partition leader change?

Hey De Gao,

The controller is always the one that elects a new leader. When that happens, 
the metadata is changed on the controller and, once committed, it’s broadcast 
to all brokers in the cluster. In KRaft this would be via a PartitionChange 
record that each broker will fetch from the controller. In ZK it’d be via an 
RPC from the controller to the broker.

In either case, each broker might get the notification at a different time. 
There is no ordering guarantee among the brokers. But eventually they’ll all 
know the new leader, which means eventually the Produce request will fail with 
a NotLeader error and the client will refresh its metadata and find the new 
leader.
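
To make that sequence concrete, here is a minimal sketch in plain Java. It is a toy model, not Kafka code: the class LeaderChangeSketch and all of its methods are hypothetical, and it assumes a broker accepts a produce request only while it believes it is the leader, and that the producer refreshes its metadata from the broker that rejected it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of leader-change propagation. Not Kafka code: every name is hypothetical.
final class LeaderChangeSketch {
    private String committedLeader;                                  // the controller's committed decision
    private final Map<String, String> brokerView = new HashMap<>();  // broker -> leader it currently believes in
    private String cachedLeader;                                     // the producer's cached metadata
    int attempts = 0;                                                // produce attempts made so far

    LeaderChangeSketch(String initialLeader, String... brokers) {
        committedLeader = initialLeader;
        cachedLeader = initialLeader;
        for (String b : brokers) {
            brokerView.put(b, initialLeader);
        }
    }

    // The controller elects a new leader; the brokers do not know yet.
    void controllerElects(String newLeader) {
        committedLeader = newLeader;
    }

    // One broker receives the metadata update, independently of the others.
    void brokerCatchesUp(String broker) {
        brokerView.put(broker, committedLeader);
    }

    // Sends to the cached leader. On a NotLeader-style rejection, refreshes
    // metadata from the rejecting broker and retries. Returns the broker that
    // accepted the message, or null if maxAttempts was exhausted.
    String send(int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            attempts++;
            String target = cachedLeader;
            if (target.equals(brokerView.get(target))) {
                return target;                      // target believes it is the leader and accepts
            }
            cachedLeader = brokerView.get(target);  // rejected: refresh metadata and retry
        }
        return null;
    }

    public static void main(String[] args) {
        LeaderChangeSketch s = new LeaderChangeSketch("B1", "B1", "B2");
        s.controllerElects("B2");
        System.out.println(s.send(3)); // B1 has not heard yet, so it still accepts
        s.brokerCatchesUp("B1");
        s.brokerCatchesUp("B2");
        System.out.println(s.send(3)); // rejected by B1, then accepted by B2
    }
}
```

The first send in main lands on the old leader because no broker has seen the update yet. That window is where the duplicates and losses discussed in this thread can arise, and the model deliberately leaves out replication and acks.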

In between all that leadership movement, there are various ways messages can 
get duplicated or lost. However, if you use the idempotent producer I believe 
you actually won’t see dupes or missing messages, so if that’s an important 
requirement you could look into that. The producer is designed to retry in 
general, and when you use the idempotent producer some extra metadata is sent 
along so that the server can dedupe any messages that were sent multiple times 
by the client.
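
As a rough illustration of that server-side dedupe, the sketch below keeps the last sequence number seen per producer id and appends a message only when its sequence is the next expected one. It is a simplified model under stated assumptions, not the broker's implementation: DedupLogSketch, Result and append are hypothetical names, and the real broker tracks more state than a single number per producer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy partition log that dedupes by (producerId, sequence).
// Hypothetical names; a simplification, not the broker's code.
final class DedupLogSketch {
    enum Result { APPENDED, DUPLICATE, OUT_OF_ORDER }

    final List<String> log = new ArrayList<>();                       // the partition's messages
    private final Map<Long, Integer> lastSequence = new HashMap<>();  // producerId -> last appended sequence

    Result append(long producerId, int sequence, String message) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return Result.DUPLICATE;      // already appended: acknowledge without writing again
        }
        if (sequence != last + 1) {
            return Result.OUT_OF_ORDER;   // a gap means an earlier message is missing
        }
        log.add(message);
        lastSequence.put(producerId, sequence);
        return Result.APPENDED;
    }

    public static void main(String[] args) {
        DedupLogSketch partition = new DedupLogSketch();
        partition.append(7L, 0, "a");
        partition.append(7L, 1, "b");
        partition.append(7L, 1, "b");       // client retry after a lost acknowledgement
        System.out.println(partition.log);  // [a, b]
    }
}
```

On the client side, this behaviour is switched on with the producer setting enable.idempotence=true.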

If you’re interested in learning more about Kafka internals, I highly recommend 
this blog series: 
https://www.confluent.io/blog/apache-kafka-architecture-and-internals-by-jun-rao/

Hope that helped a bit.

Andy


> On Nov 20, 2023, at 2:07 AM, De Gao <d...@live.co.uk> wrote:
>
> Hi all, I have an interesting question here.
>
> Let's say we have 2 brokers B1 and B2, a controller C, and producers P1, 
> P2...Pn. Currently B1 is the partition leader and Px is constantly producing 
> messages to B1. We want to move the partition leadership to B2. How is the 
> leadership change synced between B1, B2, C, and Px so that it is guaranteed 
> that all parties acknowledge the leadership change in the right order? Is 
> there a break in the produce flow in between? Any chance of message loss?
>
> Thanks
>
> De Gao
