anshul35 opened a new pull request, #17492:
URL: https://github.com/apache/kafka/pull/17492

   Issue Details: In TopicBasedRemoteLogMetadataManager::close, the calling 
thread (t1) acquires the writeLock on the "lock" object and then calls join on 
initializationThread, so t1 waits for initializationThread to complete while 
still holding the lock. initializationThread also acquires the writeLock on the 
same "lock" object internally. This can cause a deadlock in the following 
situation:
   
   1. initializationThread is started.
   2. close is invoked from a separate thread, but the OS has not yet scheduled 
that thread.
   3. At line 430, initializationThread is preempted and the OS starts running 
the close thread. close takes the writeLock and invokes join on 
initializationThread.
   4. The OS schedules initializationThread again, and at line 433 it also 
tries to take the writeLock. Because the close thread already holds the 
writeLock, each thread waits on the other: initializationThread waits for close 
to release the writeLock, while the close thread waits for 
initializationThread to complete.
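   
   The interleaving above can be reproduced in isolation. The following is a 
minimal sketch, not the Kafka code: the class and method names are 
hypothetical, a CountDownLatch stands in for the preemption in step 3, and the 
join is bounded by a timeout (the close described above joins while holding the 
lock) so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone demo; none of these names come from Kafka.
class CloseJoinDeadlockSketch {

    // Returns true if the initializer was still blocked on the write lock
    // when the closer's bounded join expired.
    static boolean initializerStuckWhileCloserHoldsLock() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch closerHoldsLock = new CountDownLatch(1);

        Thread initializer = new Thread(() -> {
            try {
                // Stand-in for step 3: do not proceed until the closer owns the lock.
                closerHoldsLock.await();
                // Step 4: blocks here because the closer holds the write lock.
                lock.writeLock().lockInterruptibly();
                try {
                    // Initialization work would go here.
                } finally {
                    lock.writeLock().unlock();
                }
            } catch (InterruptedException e) {
                // Interrupted by the demo during cleanup; just exit.
            }
        });
        initializer.start();

        boolean stuck;
        lock.writeLock().lock();
        try {
            closerHoldsLock.countDown();
            // Join while holding the lock; bounded here so the demo can finish.
            initializer.join(500);
            stuck = initializer.isAlive();
        } finally {
            lock.writeLock().unlock();
        }

        // Cleanup: the lock is free now, so the initializer can finish or be interrupted.
        initializer.interrupt();
        initializer.join();
        return stuck;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("initializer stuck: " + initializerStuckWhileCloserHoldsLock());
    }
}
```

   With an unbounded join in place of join(500), neither thread could make 
progress, which is the deadlock described in step 4.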
   
   Fix Details: close should begin its processing only if initialization has 
either not yet started or has already completed. Similarly, the initialization 
thread should not start any processing if another thread has already invoked 
close. This can be achieved by acquiring the writeLock before starting either 
close or initialization, so that one happens strictly before the other.
   
   Case 1: close() is invoked after the initialization thread has completed. 
In this case, close() closes all the resources and returns.
   
   Case 2: close() is invoked while the initialization thread is running. In 
this case, the thread invoking close() waits to acquire the writeLock, i.e. 
until the initialization thread has completed.
   
   Case 3: close() is invoked before initializationThread starts. In this 
case, close() sets closing to true and returns. When initialization starts, it 
acquires the writeLock and then reads the closing instance variable. Because 
closing is true, it does not enter the while loop and simply exits.
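   
   The three cases can be sketched as follows. This is an illustrative 
outline under stated assumptions, not the actual patch: the class 
GuardedLifecycleSketch, its fields other than closing, and the helper methods 
are hypothetical, and the real initialization and resource cleanup are replaced 
by boolean flags. Both initialize() and close() take the writeLock first, and 
close() never joins another thread while holding it.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; only the "closing" flag and the write lock mirror the description.
class GuardedLifecycleSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closing = false;          // guarded by lock
    private boolean initialized = false;      // guarded by lock
    private boolean resourcesClosed = false;  // guarded by lock

    // Body of the initialization thread: acquire the write lock before any work.
    void initialize() {
        lock.writeLock().lock();
        try {
            // Case 3: if close() ran first, closing is true and the loop is skipped.
            while (!closing && !initialized) {
                initialized = true; // stand-in for the real initialization work
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Case 2: if initialize() is running, this blocks on the lock until it finishes.
    void close() {
        lock.writeLock().lock();
        try {
            closing = true;
            if (initialized) {
                resourcesClosed = true; // Case 1: stand-in for closing real resources
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean isInitialized() {
        lock.readLock().lock();
        try {
            return initialized;
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean isResourcesClosed() {
        lock.readLock().lock();
        try {
            return resourcesClosed;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Case 3: close() first, then the initialization thread. Returns whether init did any work.
    static boolean initializedWhenClosedFirst() throws InterruptedException {
        GuardedLifecycleSketch manager = new GuardedLifecycleSketch();
        manager.close();
        Thread init = new Thread(manager::initialize);
        init.start();
        init.join(); // safe: no lock is held while joining
        return manager.isInitialized();
    }

    // Case 1: initialization completes, then close(). Returns whether resources were closed.
    static boolean resourcesClosedWhenClosedAfterInit() throws InterruptedException {
        GuardedLifecycleSketch manager = new GuardedLifecycleSketch();
        Thread init = new Thread(manager::initialize);
        init.start();
        init.join();
        manager.close();
        return manager.isResourcesClosed();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Case 3, initialized: " + initializedWhenClosedFirst());
        System.out.println("Case 1, resources closed: " + resourcesClosedWhenClosedAfterInit());
    }
}
```

   In this sketch the first helper returns false and the second returns true. 
Case 2 needs no extra code: the write lock is exclusive, so a close() that 
arrives during initialize() simply waits for the lock to be released.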
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
