chia7712 commented on pull request #6915:
URL: https://github.com/apache/kafka/pull/6915#issuecomment-626119452


   @junrao Thanks for the great explanation. Lock handling in Kafka has indeed 
been through a long series of improvements :)
   
   > After it appends to the local log, it may call 
ReplicaManager.tryCompleteDelayedProduce(),
   
   Just to double-check: `ReplicaManager.tryCompleteDelayedProduce` no longer 
exists on trunk, and its replacement is 
`Partition#tryCompleteDelayedRequests`, right? 
(https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/cluster/Partition.scala#L850)
   ```scala
     /**
      * Try to complete any pending requests. This should be called without holding the leaderIsrUpdateLock.
      */
     private def tryCompleteDelayedRequests(): Unit = delayedOperations.checkAndCompleteAll()
   ```
   
   >  which may need to hold a different group lock (since the key of the 
operation is a topic partition on which many groups can reside), which can 
cause a deadlock.
   
   It seems we should introduce a check that all group locks have been released 
before completing topic-partition-level delayed produce operations.
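
To make the idea concrete, here is a rough sketch of what such a check could look like. It is not code from the Kafka codebase: `GroupLock`, `DelayedProduceGuard` and `completeIfNoGroupLockHeld` are hypothetical names made up for illustration, and the sketch assumes each group is guarded by a `ReentrantLock` so that `isHeldByCurrentThread` can be used to detect a lock the caller forgot to release.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical stand-in for a group's state; only its lock matters here.
final class GroupLock(val groupId: String) {
  private val lock = new ReentrantLock()

  // Run `body` while holding this group's lock, always releasing it afterwards.
  def inLock[T](body: => T): T = {
    lock.lock()
    try body finally lock.unlock()
  }

  // True only if the calling thread currently holds this group's lock.
  def heldByCurrentThread: Boolean = lock.isHeldByCurrentThread
}

object DelayedProduceGuard {
  // Hypothetical check: refuse to run the partition-level completion while the
  // calling thread still holds any of the given group locks, since completing
  // may need to acquire a different group's lock and could deadlock.
  def completeIfNoGroupLockHeld(groups: Iterable[GroupLock])(complete: () => Unit): Unit = {
    val held = groups.filter(_.heldByCurrentThread).map(_.groupId)
    if (held.nonEmpty)
      throw new IllegalStateException(
        s"Group locks still held by the current thread: ${held.mkString(", ")}")
    complete()
  }
}
```

With something like this, a caller that tries to complete delayed produce from inside `group.inLock { ... }` would fail fast with an `IllegalStateException` instead of risking a deadlock. Whether the real check should throw, log, or defer the completion is an open question.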
   


