ywkaras commented on code in PR #11184:
URL: https://github.com/apache/trafficserver/pull/11184#discussion_r1561791523
##########
src/tsutil/DbgCtl.cc:
##########
@@ -150,14 +150,28 @@ DbgCtl::_new_reference(char const *tag)
   DebugInterface *p = DebugInterface::get_instance();
   debug_assert(tag != nullptr);
-  // DbgCtl instances may be declared as static objects in the destructors of objects not destoyed till program exit.
+  // DbgCtl instances may be declared as static objects in the destructors of objects not destroyed till program exit.
   // So, we must handle the case where the construction of such instances of DbgCtl overlaps with the destruction of
   // other instances of DbgCtl. That is why it is important to make sure the reference count is non-zero before
   // constructing _RegistryAccessor. The _RegistryAccessor constructor is thereby able to assume that, if it creates
   // the Registry, the new Registry will not be destroyed before the mutex in the new Registry is locked.
   ++_RegistryAccessor::registry_reference_count;
+  // There is a mutex in the C/C++ runtime that both dlopen() and _cxa_thread_atexit() lock while running.
+  // Creating a _RegistryAccessor instance locks the registry mutex. If the subsequent code in this function triggers
+  // the construction of a thread_local variable (with a non-trivial destructor), the following deadlock scenario is
+  // possible:
+  // 1. Thread 1 calls a DbgCtl constructor, which locks the registry mutex, but then is suspended.
+  // 2. Thread 2 calls dlopen() for a plugin, locking the runtime mutex. It then executes the constructor for a
+  // statically allocated DbgCtl object, which blocks on locking the registry mutex.
+  // 3. Thread 1 resumes, and calls member functions of the derived class of DebugInterface. If this causes the
+  // the construction of a thread_local variable with a non-trivial destructor, _cxa_thread_atexit() will be called
+  // to set up a call of the variable's destructor at thread exit. The call to _cxa_thread_atexit() will block on
+  // the runtime mutex (held by Thread 2). So Thread 1 holds the registry mutex and is blocked waiting for the
+  // runtime mutex. And Thread 2 holds the runtime mutex and is blocked waiting for the registry mutex. Deadlock.
+  //
+  // This deadlock is avoided by having the thread_local variable register its destruction in a non-thread_local class.

Review Comment:
   ??? I thought this was already present.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
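
Illustrative note on the three-step scenario in the quoted comment: the lock-order inversion can be replayed in a small, self-contained sketch. It assumes two plain std::mutex objects as hypothetical stand-ins for the registry mutex and the runtime mutex, and it uses try_lock() so that each thread reports that it would block instead of actually hanging. The names registry_mutex, runtime_mutex, Outcome and replay_inversion are invented for illustration. This is not the code from the PR, and it does not show the PR's fix.

```cpp
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; not the real Traffic Server or C++ runtime objects.
static std::mutex registry_mutex; // plays the role of the DbgCtl registry mutex
static std::mutex runtime_mutex;  // plays the role of the runtime mutex taken by dlopen()

struct Outcome {
  bool thread1_got_runtime  = false; // could Thread 1 take the runtime mutex?
  bool thread2_got_registry = false; // could Thread 2 take the registry mutex?
};

// Replays steps 1-3 of the scenario. try_lock() is used in place of lock(),
// so the cycle is observed rather than waited on forever.
Outcome replay_inversion()
{
  Outcome out;
  std::promise<void> registry_held, runtime_held, t1_tried, t2_tried;
  std::future<void> registry_held_f = registry_held.get_future();
  std::future<void> runtime_held_f  = runtime_held.get_future();
  std::future<void> t1_tried_f      = t1_tried.get_future();
  std::future<void> t2_tried_f      = t2_tried.get_future();

  std::thread t1([&] {
    // Step 1: Thread 1 locks the registry mutex, then is "suspended".
    std::lock_guard<std::mutex> hold(registry_mutex);
    registry_held.set_value();
    runtime_held_f.wait();
    // Step 3: Thread 1 resumes and needs the runtime mutex.
    out.thread1_got_runtime = runtime_mutex.try_lock();
    if (out.thread1_got_runtime) {
      runtime_mutex.unlock();
    }
    t1_tried.set_value();
    t2_tried_f.wait(); // keep holding the registry mutex until Thread 2 has tried
  });

  std::thread t2([&] {
    registry_held_f.wait();
    // Step 2: Thread 2 locks the runtime mutex, then needs the registry mutex.
    std::lock_guard<std::mutex> hold(runtime_mutex);
    runtime_held.set_value();
    out.thread2_got_registry = registry_mutex.try_lock();
    if (out.thread2_got_registry) {
      registry_mutex.unlock();
    }
    t2_tried.set_value();
    t1_tried_f.wait(); // keep holding the runtime mutex until Thread 1 has tried
  });

  t1.join();
  t2.join();
  return out;
}
```

Calling replay_inversion() should report that neither thread could take the second mutex, which is the state that becomes a deadlock when both calls are blocking lock() calls. The general way out is to make sure the work that needs the second mutex never runs while the first is held, which is the property the comment's last sentence is aiming for.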