ywkaras commented on code in PR #11184:
URL: https://github.com/apache/trafficserver/pull/11184#discussion_r1561791523


##########
src/tsutil/DbgCtl.cc:
##########
@@ -150,14 +150,28 @@ DbgCtl::_new_reference(char const *tag)
   DebugInterface *p = DebugInterface::get_instance();
   debug_assert(tag != nullptr);
 
-  // DbgCtl instances may be declared as static objects in the destructors of objects not destoyed till program exit.
+  // DbgCtl instances may be declared as static objects in the destructors of objects not destroyed till program exit.
   // So, we must handle the case where the construction of such instances of DbgCtl overlaps with the destruction of
   // other instances of DbgCtl.  That is why it is important to make sure the reference count is non-zero before
   // constructing _RegistryAccessor.  The _RegistryAccessor constructor is thereby able to assume that, if it creates
   // the Registry, the new Registry will not be destroyed before the mutex in the new Registry is locked.
 
   ++_RegistryAccessor::registry_reference_count;
 
+  // There is a mutex in the C/C++ runtime that both dlopen() and _cxa_thread_atexit() lock while running.
+  // Creating a _RegistryAccessor instance locks the registry mutex.  If the subsequent code in this function triggers
+  // the construction of a thread_local variable (with a non-trivial destructor), the following deadlock scenario is
+  // possible:
+  // 1.  Thread 1 calls a DbgCtl constructor, which locks the registry mutex, but then is suspended.
+  // 2.  Thread 2 calls dlopen() for a plugin, locking the runtime mutex.  It then executes the constructor for a
+  //     statically allocated DbgCtl object, which blocks on locking the registry mutex.
+  // 3.  Thread 1 resumes, and calls member functions of the derived class of DebugInterface.  If this causes the
+  //     the construction of a thread_local variable with a non-trivial destructor, _cxa_thread_atexit() will be called
+  //     to set up a call of the variable's destructor at thread exit.  The call to _cxa_thread_atexit() will block on
+  //     the runtime mutex (held by Thread 2).  So Thread 1 holds the registry mutex and is blocked waiting for the
+  //     runtime mutex.  And Thread 2 holds the runtime mutex and is blocked waiting for the registry mutex.  Deadlock.
+  //
+  // This deadlock is avoided by having the thread_local variable register its destruction in a non-thread_local class.

Review Comment:
   ??? I thought this was already present.
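
   For reference, the lock-order inversion described in steps 1 to 3 of the quoted comment can be sketched in isolation. This is a minimal illustration, not code from the PR: `Locks`, `rendezvous()` and `opposite_order_would_deadlock()` are hypothetical names, and two plain `std::mutex` objects stand in for the registry mutex and the runtime mutex. It uses `try_lock()` so the cycle is detected rather than actually hanging.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; these are NOT the real Traffic Server or runtime objects.
struct Locks {
  std::mutex registry; // plays the role of the DbgCtl registry mutex
  std::mutex runtime;  // plays the role of the runtime mutex taken by dlopen()
};

// Spin until both threads have reached the same point.
inline void
rendezvous(std::atomic<int> &arrived)
{
  ++arrived;
  while (arrived.load() < 2) {
    std::this_thread::yield();
  }
}

// Returns true if each thread finds its second mutex already held by the other,
// which is exactly the state in which blocking lock() calls would deadlock.
bool
opposite_order_would_deadlock()
{
  Locks            locks;
  std::atomic<int> holding_first{0};
  std::atomic<int> tried_second{0};
  bool             t1_blocked = false;
  bool             t2_blocked = false;

  // Thread 1: registry mutex first (step 1), then the runtime mutex (step 3).
  std::thread t1([&] {
    std::lock_guard<std::mutex> hold(locks.registry);
    rendezvous(holding_first); // wait until thread 2 holds the runtime mutex
    if (locks.runtime.try_lock()) {
      locks.runtime.unlock();
    } else {
      t1_blocked = true;
    }
    rendezvous(tried_second); // keep holding until thread 2 has tried as well
  });

  // Thread 2: runtime mutex first, then the registry mutex (step 2).
  std::thread t2([&] {
    std::lock_guard<std::mutex> hold(locks.runtime);
    rendezvous(holding_first);
    if (locks.registry.try_lock()) {
      locks.registry.unlock();
    } else {
      t2_blocked = true;
    }
    rendezvous(tried_second);
  });

  t1.join();
  t2.join();
  assert(holding_first.load() == 2 && tried_second.load() == 2);
  return t1_blocked && t2_blocked;
}
```

   Calling `opposite_order_would_deadlock()` from a test reports the cycle without hanging. The sketch only demonstrates the hazard; it does not model the fix the comment describes (registering the thread_local's destruction in a non-thread_local class).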



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
