Dong Lin created KAFKA-7019:
-------------------------------

             Summary: Reduce the contention between metadata update and 
metadata read operations
                 Key: KAFKA-7019
                 URL: https://issues.apache.org/jira/browse/KAFKA-7019
             Project: Kafka
          Issue Type: Improvement
            Reporter: Dong Lin
            Assignee: Radai Rosenblatt


Currently MetadataCache.updateCache() grabs a write lock in order to process 
the UpdateMetadataRequest from the controller, and a read lock is needed in 
order to handle a MetadataRequest from clients. Thus the handling of 
MetadataRequest and UpdateMetadataRequest block each other, and the broker can 
only process one such request at a time even if there are multiple request 
handler threads. Note that the broker cannot process MetadataRequests in 
parallel if there is an UpdateMetadataRequest waiting for the write lock, even 
though a MetadataRequest only requires the read lock to be processed.
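
To make the contention concrete, here is a minimal sketch of the locking 
arrangement described above. It is not the actual Kafka code: the class name 
LockedMetadataCache, the leaderFor() method, and the map of topic-partition 
name to leader broker id are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for a lock-protected metadata cache.
final class LockedMetadataCache {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Assumed shape of the cached state: topic-partition name -> leader broker id.
    private final Map<String, Integer> leaderByPartition = new HashMap<>();

    // Update path (UpdateMetadataRequest): excludes all readers while it runs.
    void updateCache(Map<String, Integer> changes) {
        lock.writeLock().lock();
        try {
            leaderByPartition.putAll(changes);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read path (MetadataRequest): shared with other readers, but must wait
    // whenever the write lock is held.
    Integer leaderFor(String topicPartition) {
        lock.readLock().lock();
        try {
            return leaderByPartition.get(topicPartition);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Every call to leaderFor() has to wait for the duration of any in-progress 
updateCache(), which is the blocking this issue wants to remove.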

For a large cluster with tens of thousands of partitions, it can take e.g. 
200 ms to process an UpdateMetadataRequest or a MetadataRequest from a large 
client (e.g. MirrorMaker). While a user is rebalancing the cluster, the 
leadership changes will cause both UpdateMetadataRequests from the controller 
and MetadataRequests from clients. If a broker receives 10 MetadataRequests 
per second and 2 UpdateMetadataRequests per second on average, and each takes 
on the order of 200 ms, that is roughly 2.4 seconds of work arriving every 
second. Since these requests need to be processed one at a time, this can 
reduce the request handler thread idle ratio to 0, which makes the broker 
unavailable to users.

We can address this problem by removing the read/write lock in MetadataCache. 
The idea is that MetadataCache.updateCache() can instantiate a new copy of the 
cache as a method-local variable while it is processing the 
UpdateMetadataRequest, and replace the class-private variable with the newly 
instantiated method-local variable at the end of MetadataCache.updateCache(). 
All of this can be done without grabbing any lock. The handling of a 
MetadataRequest only requires access to the read-only class-private variable.
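
The proposal could be sketched roughly as follows. This is an illustration 
under stated assumptions, not the eventual patch: the class name 
SnapshotMetadataCache, the leaderFor() and currentSnapshot() methods, and the 
map of topic-partition name to leader broker id are hypothetical. The sketch 
also assumes that updates are applied by one thread at a time; concurrent 
updaters would need some coordination among themselves so that no update is 
lost.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a lock-free, copy-on-write metadata cache.
final class SnapshotMetadataCache {
    // Class-private reference to a read-only snapshot. The volatile modifier
    // makes a newly published snapshot visible to reader threads.
    private volatile Map<String, Integer> leaderByPartition = Collections.emptyMap();

    // Update path: build the new state in a method-local copy, then publish it
    // with a single reference assignment at the end. No lock is taken.
    // Assumption: only one thread calls this at a time.
    void updateCache(Map<String, Integer> changes) {
        Map<String, Integer> copy = new HashMap<>(leaderByPartition);
        copy.putAll(changes);
        leaderByPartition = Collections.unmodifiableMap(copy);
    }

    // Read path: grab the current snapshot once and read only from it, so a
    // concurrent update cannot change the data in the middle of the read.
    Integer leaderFor(String topicPartition) {
        Map<String, Integer> snapshot = leaderByPartition;
        return snapshot.get(topicPartition);
    }

    // Exposes the current read-only snapshot, e.g. for building a full response.
    Map<String, Integer> currentSnapshot() {
        return leaderByPartition;
    }
}
```

Readers that obtained a snapshot before an update keep seeing the old, 
consistent state, while later readers see the new one. The cost is one copy 
of the cache per update, paid only on the update path.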

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
