[ 
https://issues.apache.org/jira/browse/KAFKA-8702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Falko updated KAFKA-8702:
--------------------------------
    Affects Version/s: 2.3.0

> Kafka leader election doesn't happen when leader broker port is partitioned 
> off the network
> -------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8702
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8702
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.1.0, 2.3.0
>            Reporter: Andrey Falko
>            Priority: Major
>
> We first started seeing this with Kafka 2.1.1. We are currently on 2.3.0. 
> We were able to actively reproduce this today on one of our staging 
> environments. There are three brokers in this environment, 0, 1, and 2. The 
> reproduction steps are as follows: 
>  1) Push some traffic to a topic that looks like this: 
>  $ bin/kafka-topics.sh --describe --zookeeper $(grep zookeeper.connect= /kafka/config/server.properties | awk -F= '{print $2}') --topic test
>  Topic:test      PartitionCount:6        ReplicationFactor:3     Configs:cleanup.policy=delete,retention.ms=86400000
>         Topic: test     Partition: 0    Leader: 0       Replicas: 2,0,1 Isr: 0,1,2
>         Topic: test     Partition: 1    Leader: 0       Replicas: 0,1,2 Isr: 0,1,2
>         Topic: test     Partition: 2    Leader: 1       Replicas: 1,2,0 Isr: 1,2,0
>         Topic: test     Partition: 3    Leader: 2       Replicas: 2,1,0 Isr: 1,2,0
>         Topic: test     Partition: 4    Leader: 0       Replicas: 0,2,1 Isr: 0,1,2
>         Topic: test     Partition: 5    Leader: 1       Replicas: 1,0,2 Isr: 1,2,0
> 2) We proceed to run the following on broker 0:
>  iptables -A INPUT -j DROP -p tcp --destination-port 9093 && iptables -A OUTPUT -j DROP -p tcp --destination-port 9093
>  Note: our replication traffic and our client traffic both use the 
> TLS-protected port 9093 only. 
> 3) Leadership doesn't change because the ZooKeeper connection is unaffected. 
> However, we start seeing under-replicated partitions (URP). 
> 4) We reboot broker 0. We see offline partitions. Leadership never changes 
> and the cluster only recovers when broker 0 comes back online.
> Best regards,
>  Andrey Falko
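
For anyone re-running the reproduction above, the following is a minimal sketch of two helpers for checking the state described in steps 1 and 3. The function names `partitions_led_by` and `under_replicated` are hypothetical (they are not part of Kafka), and the sketch assumes the `kafka-topics.sh --describe` output arrives on stdin with one partition per line, not wrapped as in this email.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Kafka. Both read the output of
# `kafka-topics.sh --describe` on stdin, one partition per line.

# Print the partition numbers whose current leader is broker $1.
partitions_led_by() {
  awk -v broker="$1" '
    /Partition:/ {
      part = ""; leader = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Leader:")    leader = $(i + 1)
      }
      if (leader == broker) print part
    }'
}

# Print the partition numbers whose ISR is smaller than the replica set,
# i.e. the under-replicated partitions (URP).
under_replicated() {
  awk '
    /Partition:/ {
      part = ""; nrep = 0; nisr = 0
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  nrep = split($(i + 1), r, ",")
        if ($i == "Isr:")       nisr = split($(i + 1), s, ",")
      }
      if (nisr < nrep) print part
    }'
}
```

Piping the describe command from step 1 into `partitions_led_by 0` before and after step 2 would show whether leadership moved off broker 0, and piping it into `under_replicated` would list the partitions that have dropped out of full ISR.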



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
