[ 
https://issues.apache.org/jira/browse/KAFKA-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339225#comment-14339225
 ] 

Joel Koshy commented on KAFKA-1987:
-----------------------------------

Looking at the code, I think this is possible when the follower receives the 
LeaderAndIsr request before the leader does. It is probably harmless, since 
the replica fetcher will just wait for the leader to process its leader 
transition. We could have the controller wait until the leader responds before 
sending the LeaderAndIsr request to the followers, but I am not sure that is 
worth doing.
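
To make the two orderings concrete, here is a rough toy model in Python. It is 
not Kafka code: Broker, create_partition and UnknownTopicOrPartition are 
hypothetical names made up for this sketch, and the number of early fetch 
attempts is an arbitrary assumption. It only shows that a follower that starts 
fetching before the leader has made its transition sees a few errors and then 
recovers, while a leader-first ordering avoids the errors entirely.

```python
from dataclasses import dataclass, field


class UnknownTopicOrPartition(Exception):
    """Hypothetical stand-in for kafka.common.UnknownTopicOrPartitionException."""


@dataclass
class Broker:
    broker_id: int
    leading: set = field(default_factory=set)  # partitions this broker leads
    errors: int = 0                            # failed fetches seen as a follower

    def become_leader(self, partition):
        # Models the leader finishing its leader transition.
        self.leading.add(partition)

    def handle_fetch(self, partition):
        # A broker that has not yet become leader does not know the partition.
        if partition not in self.leading:
            raise UnknownTopicOrPartition(partition)
        return []  # nothing has been produced yet

    def fetch_once(self, leader, partition):
        # One replica fetcher attempt; returns True if the fetch succeeded.
        try:
            leader.handle_fetch(partition)
            return True
        except UnknownTopicOrPartition:
            self.errors += 1
            return False


def create_partition(leader, follower, partition, leader_first,
                     fetches_before_leader=3):
    if leader_first:
        # Controller waits for the leader before telling the follower.
        leader.become_leader(partition)
    else:
        # Follower handles its request first and fetches too early.
        for _ in range(fetches_before_leader):
            follower.fetch_once(leader, partition)
        leader.become_leader(partition)
    # In both orderings the next fetch succeeds once the leader is ready.
    return follower.fetch_once(leader, partition)
```

With leader_first=False the follower in this model records three errors before 
its fetch succeeds; with leader_first=True it records none. The cost of the 
second ordering, which the model does not show, is an extra round trip in the 
controller for every partition creation.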

> Potential race condition in partition creation
> ----------------------------------------------
>
>                 Key: KAFKA-1987
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1987
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Todd Palino
>            Assignee: Neha Narkhede
>
> I am finding that there appears to be a race condition when creating 
> partitions, with replication factor 2 or higher, between the creation of the 
> partition on the leader and the follower. What appears to be happening is 
> that the follower is processing the command to create the partition before 
> the leader does, and when the follower starts the replica fetcher, it fails 
> with an UnknownTopicOrPartitionException.
>
> The situation is that I am creating a large number of partitions on a 
> cluster, preparing it for data being mirrored from another cluster. So there 
> are a sizeable number of create and alter commands being sent sequentially. 
> Eventually, the replica fetchers start up properly. But it seems like the 
> controller should issue the command to create the partition to the leader, 
> wait for confirmation, and then issue the command to create the partition to 
> the followers.
>
> 2015/02/26 21:11:50.413 INFO [LogManager] [kafka-request-handler-12] 
> [kafka-server] [] Created log for partition [topicA,30] in 
> /path_to/i001_caches with properties {segment.index.bytes -> 10485760, 
> file.delete.delay.ms -> 60000, segment.bytes -> 268435456, flush.ms -> 10000, 
> delete.retention.ms -> 86400000, index.interval.bytes -> 4096, 
> retention.bytes -> -1, min.insync.replicas -> 1, cleanup.policy -> delete, 
> unclean.leader.election.enable -> true, segment.ms -> 43200000, 
> max.message.bytes -> 1000000, flush.messages -> 20000, 
> min.cleanable.dirty.ratio -> 0.5, retention.ms -> 86400000, segment.jitter.ms 
> -> 0}.
> 2015/02/26 21:11:50.418 WARN [Partition] [kafka-request-handler-12] 
> [kafka-server] [] Partition [topicA,30] on broker 1551: No checkpointed 
> highwatermark is found for partition [topicA,30]
> 2015/02/26 21:11:50.418 INFO [ReplicaFetcherManager] 
> [kafka-request-handler-12] [kafka-server] [] [ReplicaFetcherManager on broker 
> 1551] Removed fetcher for partitions [topicA,30]
> 2015/02/26 21:11:50.418 INFO [Log] [kafka-request-handler-12] [kafka-server] 
> [] Truncating log topicA-30 to offset 0.
> 2015/02/26 21:11:50.450 INFO [ReplicaFetcherManager] 
> [kafka-request-handler-12] [kafka-server] [] [ReplicaFetcherManager on broker 
> 1551] Added fetcher for partitions List([[topicA,30], initOffset 0 to broker 
> id:1555,host:host1555.example.com,port:10251] )
> 2015/02/26 21:11:50.615 ERROR [ReplicaFetcherThread] 
> [ReplicaFetcherThread-0-1555] [kafka-server] [] 
> [ReplicaFetcherThread-0-1555], Error for partition [topicA,30] to broker 
> 1555:class kafka.common.UnknownTopicOrPartitionException
> 2015/02/26 21:11:50.616 ERROR [ReplicaFetcherThread] 
> [ReplicaFetcherThread-0-1555] [kafka-server] [] 
> [ReplicaFetcherThread-0-1555], Error for partition [topicA,30] to broker 
> 1555:class kafka.common.UnknownTopicOrPartitionException
> 2015/02/26 21:11:50.618 ERROR [ReplicaFetcherThread] 
> [ReplicaFetcherThread-0-1555] [kafka-server] [] 
> [ReplicaFetcherThread-0-1555], Error for partition [topicA,30] to broker 
> 1555:class kafka.common.UnknownTopicOrPartitionException
> 2015/02/26 21:11:50.620 ERROR [ReplicaFetcherThread] 
> [ReplicaFetcherThread-0-1555] [kafka-server] [] 
> [ReplicaFetcherThread-0-1555], Error for partition [topicA,30] to broker 
> 1555:class kafka.common.UnknownTopicOrPartitionException
> 2015/02/26 21:11:50.621 ERROR [ReplicaFetcherThread] 
> [ReplicaFetcherThread-0-1555] [kafka-server] [] 
> [ReplicaFetcherThread-0-1555], Error for partition [topicA,30] to broker 
> 1555:class kafka.common.UnknownTopicOrPartitionException
> 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
