Kyle Ambroff-Kao created KAFKA-6468:
---------------------------------------

             Summary: Replication high watermark checkpoint file read for every LeaderAndIsrRequest
                 Key: KAFKA-6468
                 URL: https://issues.apache.org/jira/browse/KAFKA-6468
             Project: Kafka
          Issue Type: Bug
            Reporter: Kyle Ambroff-Kao


The high watermark for each partition in a given log directory is written to
disk every _replica.high.watermark.checkpoint.interval.ms_ milliseconds. This
checkpoint file is read back when a broker creates its replicas as it joins
the cluster.

[https://github.com/apache/kafka/blob/b73c765d7e172de4742a3aa023d5a0a4b7387247/core/src/main/scala/kafka/cluster/Partition.scala#L180]

Unfortunately, this file is read every time
kafka.cluster.Partition#getOrCreateReplica is invoked. For most clusters this
isn't a big deal, but for a small cluster with lots of partitions, the
repeated reads of this file really add up.

On my local test cluster of three brokers with around 40k partitions, the
initial LeaderAndIsrRequest refers to every partition in the cluster, and it
can take 20 to 30 minutes to create all of the replicas because the
_replication-offset-checkpoint_ file is nearly 2MB and is re-read for each
replica that is created.
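
As a rough back-of-the-envelope check on why the repeated reads are so
expensive, the sketch below multiplies the number of replicas by the size of
the checkpoint file. It assumes that one broker creates a replica for every
one of the ~40k partitions and rounds "nearly 2MB" up to 2 MiB; the class and
method names are hypothetical and are not part of Kafka. Under those
assumptions a single broker reads and parses roughly 78 GiB during startup.

```java
final class CheckpointReadEstimate {
    // Total bytes read if the whole checkpoint file is re-read once per
    // replica that gets created.
    static long totalBytesRead(long replicas, long checkpointFileBytes) {
        return Math.multiplyExact(replicas, checkpointFileBytes);
    }

    public static void main(String[] args) {
        long replicas = 40_000L;           // assumption: every partition has a replica on this broker
        long fileBytes = 2L * 1024 * 1024; // assumption: "nearly 2MB" rounded up to 2 MiB
        long total = totalBytesRead(replicas, fileBytes);
        double gib = total / (1024.0 * 1024.0 * 1024.0);
        System.out.println("approx. GiB read and parsed: " + gib);
    }
}
```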

Changing this code so that we only read this file once on startup reduces the 
time to create all replicas to around one minute.
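
The read-once idea can be sketched as follows. This is a minimal
illustration and not the actual patch: CachedCheckpoints, its loader
argument, and the default of 0 for a partition missing from the file are all
assumptions made for the example, and the Supplier stands in for reading and
parsing the checkpoint file from disk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not a Kafka class: loads the checkpoint map on the
// first lookup and serves every later lookup from memory.
final class CachedCheckpoints {
    private final Supplier<Map<String, Long>> loader; // stands in for reading the file
    private Map<String, Long> cached;                  // null until the first lookup
    private int loads = 0;                             // how many times the loader ran

    CachedCheckpoints(Supplier<Map<String, Long>> loader) {
        this.loader = loader;
    }

    // Returns the checkpointed high watermark, or 0 if the partition is
    // absent (the default is an assumption of this sketch).
    synchronized long highWatermark(String topicPartition) {
        if (cached == null) {
            cached = loader.get();
            loads++;
        }
        return cached.getOrDefault(topicPartition, 0L);
    }

    synchronized int loads() {
        return loads;
    }

    public static void main(String[] args) {
        Map<String, Long> onDisk = new HashMap<>();
        for (int p = 0; p < 40_000; p++) {
            onDisk.put("topic-" + p, (long) p);
        }
        CachedCheckpoints checkpoints = new CachedCheckpoints(() -> new HashMap<>(onDisk));
        for (int p = 0; p < 40_000; p++) {
            checkpoints.highWatermark("topic-" + p);
        }
        System.out.println("file loads: " + checkpoints.loads()); // prints "file loads: 1"
    }
}
```

With this shape, creating 40k replicas triggers one load instead of 40k. A
real change would also have to decide when, if ever, the cached map should be
dropped or refreshed.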

Credit to [~onurkaraman] for finding this one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
