Hey, we had the same issue as you.
I checked the code, and it chooses the first live replica from the assignment list. If you describe a topic with kafka-topics, you will see the list of brokers that hold a replica of each partition, for example [1001, 1002, 1003]. Given that list, Kafka will choose the first replica in it that is available (online).

We use "acks=all" and "min.insync.replicas=2". Every acknowledged write was therefore on at least one follower as well as the leader, so even if the leader is down and the rest of the replicas have fallen out of the ISR, one of the follower replicas should have up-to-date data. You can compare the two follower replicas with the kafka-dump-log tool to see which one is more up to date.

If you run a partition reassignment, you can change the order of the followers in the assignment list and then trigger an unclean leader election for the reassigned partitions.

So it seems that this way, assuming "acks=all" and "min.insync.replicas=2", we can recover without data loss. But only if my assumption above is correct. And please test this before using it on live data.

Peter

On Mon, 28 Jun 2021 at 09:53, Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:

> On Mon, Jun 21, 2021 at 12:33 PM Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
> >
> > In summary: is there a risk of data loss in such a scenario? Is this
> > risk avoidable and if so, what are the prerequisites?
>
> Apologies if I messed up line breaks and that made reading harder. O:-)
>
> The question boils down to: is replica selection completely random in case
> of unclean leader election or not?
>
>
> Regards,
> --
> Alex
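
P.S. In case it helps, here is a rough Python model of the selection rule as I read it, plus a helper that writes out a reordered assignment. It is only a sketch, not Kafka's code: the names pick_leader and reassignment_json, their parameters, and the topic "my-topic" are made up for illustration. I am also assuming that a clean election (first live replica that is still in the ISR) is tried before the unclean fallback. The JSON layout is the one kafka-reassign-partitions takes as its reassignment file.

```python
import json
from typing import Iterable, Optional, Sequence


def pick_leader(
    assignment: Sequence[int],
    live: Iterable[int],
    isr: Iterable[int],
    unclean_allowed: bool = False,
) -> Optional[int]:
    """Return the broker id that would become leader, or None if nobody qualifies."""
    live_set, isr_set = set(live), set(isr)
    # Clean election: first replica, in assignment order, that is live and in the ISR.
    for broker in assignment:
        if broker in live_set and broker in isr_set:
            return broker
    # Unclean election: fall back to the first live replica, ISR member or not.
    if unclean_allowed:
        for broker in assignment:
            if broker in live_set:
                return broker
    return None


def reassignment_json(topic: str, partition: int, replicas: Sequence[int]) -> str:
    """Build a reassignment document that lists the replicas in the desired order."""
    doc = {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }
    return json.dumps(doc)


# Leader 1001 is down and the ISR has shrunk to just the dead leader.
print(pick_leader([1001, 1002, 1003], live={1002, 1003}, isr={1001}, unclean_allowed=True))  # 1002

# Suppose the log dump shows 1003 is more up to date: put it ahead of 1002.
print(reassignment_json("my-topic", 0, [1001, 1003, 1002]))
print(pick_leader([1001, 1003, 1002], live={1002, 1003}, isr={1001}, unclean_allowed=True))  # 1003
```

The point of the second call is that the order of the list decides the outcome, not the broker ids themselves, which is why reordering before the unclean election matters.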