Albert Strasheim created KAFKA-1509:
---------------------------------------
Summary: Restart of destination broker after partition move leaves
partitions without leader
Key: KAFKA-1509
URL: https://issues.apache.org/jira/browse/KAFKA-1509
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Albert Strasheim
This should be reasonably easy to reproduce.
Make a Kafka cluster with a few machines.
Create a topic with partitions on these machines. No replication.
Bring up one more Kafka node.
Move some or all of the partitions onto this new broker:
kafka-reassign-partitions.sh --generate --zookeeper zk:2181 --topics-to-move-json-file move.json --broker-list <new broker>
kafka-reassign-partitions.sh --zookeeper 36cfqd1.in.cfops.it:2181 --reassignment-json-file reassign.json --execute
Wait until the new broker is the leader for all the partitions you moved.
Send some data to the partitions. It all works.
Shut down the broker that just received the data. Start it back up.
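The two JSON files named in the commands above are not included in this report. The following is only a sketch of what they could contain, following the JSON layout documented for kafka-reassign-partitions.sh. It assumes the topic is named test with partitions 0 and 1 and that the new broker has id 7 (values taken from the output later in this report). The helper names topics_to_move and reassignment are hypothetical.

```python
import json


def topics_to_move(topics):
    """Build the document passed via --topics-to-move-json-file."""
    return {"version": 1, "topics": [{"topic": t} for t in topics]}


def reassignment(topic, partitions, broker_id):
    """Build a reassignment that places each listed partition on one broker."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": p, "replicas": [broker_id]}
            for p in partitions
        ],
    }


# Assumed values: topic "test", partitions 0 and 1, new broker id 7.
move_json = json.dumps(topics_to_move(["test"]))
reassign_json = json.dumps(reassignment("test", [0, 1], 7))

print(move_json)      # candidate contents for move.json
print(reassign_json)  # candidate contents for reassign.json
```

With a single broker in each replicas list the topic stays unreplicated, which matches the "No replication" setup described above.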
After the restart, the topic looks like this:
{code}
Topic:test PartitionCount:2 ReplicationFactor:1 Configs:
Topic: test Partition: 0 Leader: -1 Replicas: 7 Isr:
Topic: test Partition: 1 Leader: -1 Replicas: 7 Isr:
{code}
A leader for topic test never gets elected, even though this node is the only
node that knows about the topic.
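To spot this state without reading the output by eye, a small script can scan the topic description for partitions whose leader is -1. This is a minimal sketch that assumes the text layout shown above. The function name leaderless_partitions and the regular expression are illustrative, not part of Kafka.

```python
import re

# Matches one per-partition line of the topic description shown above.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s*"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def leaderless_partitions(describe_output):
    """Return (topic, partition) pairs whose leader is reported as -1."""
    result = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match and int(match.group("leader")) == -1:
            result.append((match.group("topic"), int(match.group("partition"))))
    return result


sample = """\
Topic:test PartitionCount:2 ReplicationFactor:1 Configs:
Topic: test Partition: 0 Leader: -1 Replicas: 7 Isr:
Topic: test Partition: 1 Leader: -1 Replicas: 7 Isr:
"""

print(leaderless_partitions(sample))  # [('test', 0), ('test', 1)]
```

The summary line (PartitionCount, ReplicationFactor) does not match the pattern, so only the per-partition lines are considered.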
Some logs:
{code}
Jun 26 23:18:07 localhost kafka: INFO [Socket Server on Broker 7], Started
(kafka.network.SocketServer)
Jun 26 23:18:07 localhost kafka: INFO [Socket Server on Broker 7], Started
(kafka.network.SocketServer)
Jun 26 23:18:07 localhost kafka: INFO [ControllerEpochListener on 7]:
Initialized controller epoch to 53 and zk version 52
(kafka.controller.ControllerEpochListener)
Jun 26 23:18:07 localhost kafka: INFO Will not load MX4J, mx4j-tools.jar is not
in the classpath (kafka.utils.Mx4jLoader$)
Jun 26 23:18:07 localhost kafka: INFO Will not load MX4J, mx4j-tools.jar is not
in the classpath (kafka.utils.Mx4jLoader$)
Jun 26 23:18:07 localhost kafka: INFO [Controller 7]: Controller starting up
(kafka.controller.KafkaController)
Jun 26 23:18:07 localhost kafka: INFO conflict in /controller data:
{"version":1,"brokerid":7,"timestamp":"1403824687354"} stored data:
{"version":1,"brokerid":4,"timestamp":"1403297911725"} (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO conflict in /controller data:
{"version":1,"brokerid":7,"timestamp":"1403824687354"} stored data:
{"version":1,"brokerid":4,"timestamp":"1403297911725"} (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO [Controller 7]: Controller startup
complete (kafka.controller.KafkaController)
Jun 26 23:18:07 localhost kafka: INFO Registered broker 7 at path
/brokers/ids/7 with address xxx:9092. (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO Registered broker 7 at path
/brokers/ids/7 with address xxx:9092. (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO [Kafka Server 7], started
(kafka.server.KafkaServer)
Jun 26 23:18:07 localhost kafka: INFO [Kafka Server 7], started
(kafka.server.KafkaServer)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:3)
for partition [requests,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:1)
for partition [requests,13] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:4,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:1,5)
for partition [requests_ipv6,5] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:13,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,5)
for partition [requests_stored,7] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:5,ISR:5,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:5)
for partition [requests,6] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:7,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:1,4)
for partition [requests_ipv6,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:6,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,1)
for partition [requests_ipv6,6] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:5,ISR:5,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:5)
for partition [requests,10] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:1,5)
for partition [requests_stored,4] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,2)
for partition [requests_stored,1] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:13,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,4)
for partition [requests_stored,6] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:1)
for partition [requests,14] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:5,ISR:5,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:5)
for partition [requests,2] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test3,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:2)
for partition [requests,3] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:9,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,2)
for partition [requests_ipv6,7] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:9,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,1)
for partition [requests_ipv6,2] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:5,3)
for partition [requests_ipv6,4] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:15,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:4)
for partition [requests,5] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:1)
for partition [requests,4] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:4,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,5)
for partition [requests_ipv6,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:15,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:4)
for partition [requests,9] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,2)
for partition [requests_ipv6,3] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:15,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:2)
for partition [requests,11] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,3)
for partition [requests_stored,5] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,2)
for partition [requests_error,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test3,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:5,4)
for partition [requests_stored,3] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,3)
for partition [requests_stored,2] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:2)
for partition [requests,7] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,1)
for partition [requests_error,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,1)
for partition [requests_stored,0] in response to UpdateMetadata request sent
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:6,ISR:6,LeaderEpoch:21,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:6)
for partition [requests,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:6,ISR:6,LeaderEpoch:24,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:6)
for partition [requests,12] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:3)
for partition [requests,8] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
correlation id 71 from controller 4 epoch 53 for partition [test,0]
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
correlation id 71 from controller 4 epoch 53 for partition [test3,1]
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
correlation id 71 from controller 4 epoch 53 for partition [test,1]
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
correlation id 71 from controller 4 epoch 53 for partition [test3,0]
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 starting the become-follower
transition for partition [test,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 starting the become-follower
transition for partition [test3,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 starting the become-follower
transition for partition [test,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 starting the become-follower
transition for partition [test3,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test,0] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test3,1] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test,1] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test3,0] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7]
Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7]
Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7] Added
fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7] Added
fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 for the become-follower transition
for partition [test,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 for the become-follower transition
for partition [test3,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 for the become-follower transition
for partition [test,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request
correlationId 71 from controller 4 epoch 53 for the become-follower transition
for partition [test3,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 71 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test3,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 71 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test,1] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 71 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
for partition [test3,0] in response to UpdateMetadata request sent by
controller 4 epoch 53 with correlation id 71 (state.change.logger)
{code}
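The relevant entries in the log above are the ERROR lines where broker 7 aborts the become-follower transition because the requested leader is -1. As a rough aid for pulling those out of a long log, the sketch below extracts the affected partitions. It assumes the message wording shown above and that lines may be wrapped as they are in this report. The function name aborted_transitions is hypothetical.

```python
import re

# Pattern for the "aborted the become-follower state change" message above.
ABORTED = re.compile(
    r"ERROR Broker (?P<broker>\d+) aborted the become-follower state change "
    r"with correlation id (?P<corr>\d+) from controller (?P<controller>\d+) "
    r"epoch (?P<epoch>\d+) for partition "
    r"\[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"since new leader (?P<leader>-?\d+) is not currently available"
)


def aborted_transitions(log_text):
    """Return (topic, partition, leader) for each aborted become-follower change."""
    # Log lines in the report are wrapped, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (m.group("topic"), int(m.group("partition")), int(m.group("leader")))
        for m in ABORTED.finditer(flat)
    ]


sample = """\
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test,0] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test3,1] since new leader -1 is not currently available (state.change.logger)
"""

print(aborted_transitions(sample))  # [('test', 0, -1), ('test3', 1, -1)]
```

Run against the full log above, this should list the four partitions ([test,0], [test,1], [test3,0], [test3,1]) whose only replica is broker 7.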