bneradt opened a new issue, #10129:
URL: https://github.com/apache/trafficserver/issues/10129

   I reproduced a deadlock between DbgCtl and PCRE. I'm attaching the full 
backtrace across the threads.
   
[TSVConnFd_deadlock.txt](https://github.com/apache/trafficserver/files/12233154/TSVConnFd_deadlock.txt)
   
   The relevant threads seem to be the following:
   
   ### Holds DbgCtl _mtx, awaiting PCRE lock
   
   ```
     Thread 10 (Thread 0x7ffbf52646c0 (LWP 55868) "[LOG_FLUSH]"):
     #0  futex_wait (private=0, expected=2, futex_word=0x7ffbf73a3a08 
<_rtld_local+2568>) at ../sysdeps/nptl/futex-internal.h:146
     #1  __GI___lll_lock_wait (futex=futex@entry=0x7ffbf73a3a08 
<_rtld_local+2568>, private=0) at lowlevellock.c:49
     #2  0x00007ffbf65b0e77 in lll_mutex_lock_optimized (mutex=0x7ffbf73a3a08 
<_rtld_local+2568>) at pthread_mutex_lock.c:48
     #3  ___pthread_mutex_lock (mutex=mutex@entry=0x7ffbf73a3a08 
<_rtld_local+2568>) at pthread_mutex_lock.c:128
     #4  0x00007ffbf6560daa in __cxa_thread_atexit_impl (func=<optimized out>, 
obj=0x7ffbf52646a0, dso_symbol=0xd48008) at cxa_thread_atexit_impl.c:117
     #5  0x00007ffbf68b95dd in __cxxabiv1::__cxa_thread_atexit (dtor=<optimized 
out>, obj=<optimized out>, dso_handle=<optimized out>) at 
../../../../libstdc++-v3/libsupc++/atexit_thread.cc:69
     #6  0x0000000000d15a20 in get_jit_stack (data=0x0) at Regex.cc:43          
                            
     #7  0x00007ffbf7121965 in _pcre_jit_exec (offset_count=2, 
offsets=0x7ffbf5262f10, options=0, start_offset=0, length=12, subject=0xdb58d7 
"log-periodic", extra_data=0x2a44150) at /usr/src/debug/pcre-8.45-1.fc3
     8.3.x86_64/pcre_jit_compile.c:11650                                        
                            
     #8  pcre_exec (argument_re=0x2a440e0, extra_data=0x2a44150, 
subject=<optimized out>, length=12, start_offset=0, options=<optimized out>, 
offsets=0x7ffbf5262f10, offsetcount=30) at /usr/src/debug/pcre-8.45-1.f
     c38.3.x86_64/pcre_exec.c:6489                                              
                                                
     #9  0x0000000000d15c64 in Regex::exec (this=0x2a44300, str="log-periodic", 
ovector=0x7ffbf5262f10, ovecsize=30) at Regex.cc:119                            
              
     #10 0x0000000000d15bfe in Regex::exec (this=0x2a44300, str="log-periodic") 
at Regex.cc:111                               
     #11 0x0000000000d15fae in DFA::match (this=0x2a440c0, str="log-periodic") 
at Regex.cc:188
     #12 0x0000000000ceb76d in Diags::tag_activated (this=0x2933e00, 
tag=0xdb58d7 "log-periodic", mode=DiagsTagType_Debug) at Diags.cc:379
     #13 0x0000000000ce9632 in DbgCtl::_new_reference (tag=0xdb58d7 
"log-periodic") at DbgCtl.cc:158
     #14 0x000000000085b76c in DbgCtl::DbgCtl (this=0x1001cf0 
<Log::periodic_tasks(long)::Debug_ctl>, tag=0xdb58d7 "log-periodic") at 
../../include/tscore/DbgCtl.h:36
     #15 0x0000000000a63984 in Log::periodic_tasks (time_now=1690907172) at 
Log.cc:224  
     #16 0x0000000000a6ed19 in Log::flush_thread_main () at Log.cc:1415
     #17 0x0000000000a6f95d in LoggingFlushContinuation::mainEvent 
(this=0x3294f80) at Log.cc:284
     #18 0x0000000000863ac9 in Continuation::handleEvent (this=0x3294f80, 
event=1, data=0x3294fe0) at 
/var/tmp/trafficserver/iocore/eventsystem/I_Continuation.h:228
     #19 0x0000000000cdf621 in EThread::execute (this=0x3295050) at 
UnixEThread.cc:318                                                              
                                                        
     #20 0x0000000000cddf66 in spawn_thread_internal (a=0x33a2040) at 
Thread.cc:79                                              
     #21 0x00007ffbf65ad907 in start_thread (arg=<optimized out>) at 
pthread_create.c:444                                                            
                         
     #22 0x00007ffbf6633774 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:100                                   
   
   ```
   
   ### Threads waiting upon DbgCtl _mtx
   
   ```
     Thread 4 (Thread 0x7ffbf5a6a6c0 (LWP 55862) "[ET_NET 1]"):                 
                                            
     #0  futex_wait (private=0, expected=2, futex_word=0x28c29b0) at 
../sysdeps/nptl/futex-internal.h:146       
     #1  __GI___lll_lock_wait (futex=futex@entry=0x28c29b0, private=0) at 
lowlevellock.c:49                                                            
     #2  0x00007ffbf65b0e21 in lll_mutex_lock_optimized (mutex=0x28c29b0) at 
pthread_mutex_lock.c:48                                                         
                                   
     #3  ___pthread_mutex_lock (mutex=0x28c29b0) at pthread_mutex_lock.c:93     
                            
     #4  0x000000000086e5e9 in __gthread_mutex_lock (__mutex=0x28c29b0) at 
/usr/include/c++/13/x86_64-redhat-linux/bits/gthr-default.h:749
     #5  0x000000000086e868 in std::mutex::lock (this=0x28c29b0) at 
/usr/include/c++/13/bits/std_mutex.h:113
     #6  0x0000000000ce9aad in DbgCtl::_RegistryAccessor::_RegistryAccessor 
(this=0x7ffbf5a68fcf) at DbgCtl.cc:84
     #7  0x0000000000ce953f in DbgCtl::_new_reference (tag=0xe11791 
"inactivity_cop_check") at DbgCtl.cc:141                    
     #8  0x000000000085b76c in DbgCtl::DbgCtl (this=0x112b040 
<InactivityCop::check_inactivity(int, Event*)::Debug_ctl>, tag=0xe11791 
"inactivity_cop_check") at ../../include/tscore/DbgCtl.h:36
     #9  0x0000000000c6a722 in InactivityCop::check_inactivity 
(this=0x7ffbec0a0ca0, event=2, e=0x7ffbec0a0d30) at UnixNet.cc:78
     #10 0x0000000000863ac9 in Continuation::handleEvent (this=0x7ffbec0a0ca0, 
event=2, data=0x7ffbec0a0d30) at 
/var/tmp/trafficserver/iocore/eventsystem/I_Continuation.h:228
     #11 0x0000000000cded67 in EThread::process_event (this=0x2bf1ab0, 
e=0x7ffbec0a0d30, calling_code=2) at UnixEThread.cc:149            
     #12 0x0000000000cdf350 in EThread::execute_regular (this=0x2bf1ab0) at 
UnixEThread.cc:255      
     #13 0x0000000000cdf69b in EThread::execute (this=0x2bf1ab0) at 
UnixEThread.cc:334                                                              
                  
     #14 0x0000000000cddf66 in spawn_thread_internal (a=0x2ab0d10) at 
Thread.cc:79      
     #15 0x00007ffbf65ad907 in start_thread (arg=<optimized out>) at 
pthread_create.c:444
     #16 0x00007ffbf6633774 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:100         
   ```
   
   ```
     Thread 1 (Thread 0x7ffbf651d340 (LWP 55858) "traffic_server"):             
                                                                                
     
     #0  futex_wait (private=0, expected=2, futex_word=0x28c29b0) at 
../sysdeps/nptl/futex-internal.h:146        
     #1  __GI___lll_lock_wait (futex=futex@entry=0x28c29b0, private=0) at 
lowlevellock.c:49
     #2  0x00007ffbf65b0e21 in lll_mutex_lock_optimized (mutex=0x28c29b0) at 
pthread_mutex_lock.c:48                                      
     #3  ___pthread_mutex_lock (mutex=0x28c29b0) at pthread_mutex_lock.c:93     
                                                                                
                              
     #4  0x000000000086e5e9 in __gthread_mutex_lock (__mutex=0x28c29b0) at 
/usr/include/c++/13/x86_64-redhat-linux/bits/gthr-default.h:749 
     #5  0x000000000086e868 in std::mutex::lock (this=0x28c29b0) at 
/usr/include/c++/13/bits/std_mutex.h:113
     #6  0x0000000000ce9aad in DbgCtl::_RegistryAccessor::_RegistryAccessor 
(this=0x7fff091f43bf) at DbgCtl.cc:84
     #7  0x0000000000ce953f in DbgCtl::_new_reference (tag=0x7ffbf5163828 
"TSVConnFd") at DbgCtl.cc:141                        
     #8  0x000000000089ccb8 in TSDbgCtlCreate (tag=0x7ffbf5163828 "TSVConnFd") 
at traffic_server/InkAPI.cc:10136                                               
          
     #9  0x00007ffbf515b8d4 in __static_initialization_and_destruction_0 () at 
gold_tests/pluginTest/TSVConnFd/TSVConnFd.cc:449                                
         
     #10 0x00007ffbf515b9f2 in _GLOBAL__sub_I_TSVConnFd.cc(void) () at 
gold_tests/pluginTest/TSVConnFd/TSVConnFd.cc:864                                
                    
     #11 0x00007ffbf737517f in call_init (env=0x28c12b0, argv=0x7fff091f67d8, 
argc=2, l=<optimized out>) at dl-init.c:70                                      
  
     #12 call_init (l=<optimized out>, argc=2, argv=0x7fff091f67d8, 
env=0x28c12b0) at dl-init.c:26                                                  
                                          
     #13 0x00007ffbf737527d in _dl_init (main_map=0x33a2560, argc=2, 
argv=0x7fff091f67d8, env=0x28c12b0) at dl-init.c:117                            
                                    
     #14 0x00007ffbf73715c2 in __GI__dl_catch_exception 
(exception=exception@entry=0x0, operate=operate@entry=0x7ffbf737bea0 
<call_dl_init>, args=args@entry=0x7fff091f4610) at dl-catch.c:211               
      
     #15 0x00007ffbf737be3c in dl_open_worker (a=a@entry=0x7fff091f47c0) at 
dl-open.c:808                                                 
     #16 0x00007ffbf7371523 in __GI__dl_catch_exception 
(exception=exception@entry=0x7fff091f47a0, operate=operate@entry=0x7ffbf737bda0 
<dl_open_worker>, args=args@entry=0x7fff091f47c0) at dl-catch.c:237
     #17 0x00007ffbf737c1b4 in _dl_open (file=0x7fff091f4b70 
"/tmp/sb/TSVConnFd/ts/plugin/TSVConnFd.so", mode=<optimized out>, 
caller_dlopen=0xbcc915 <plugin_dso_load(char const*, void*&, void*&, 
std::__cxx11::bas
     ic_string<char, std::char_traits<char>, std::allocator<char> >&)+44>, 
nsid=<optimized out>, argc=2, argv=0x7fff091f67d8, env=0x28c12b0) at 
dl-open.c:884                  
     #18 0x00007ffbf65a96d4 in dlopen_doit (a=a@entry=0x7fff091f4a70) at 
dlopen.c:56                                                                     
                                   
     #19 0x00007ffbf7371523 in __GI__dl_catch_exception 
(exception=exception@entry=0x7fff091f49b0, operate=0x7ffbf65a9670 
<dlopen_doit>, args=0x7fff091f4a70) at dl-catch.c:237
     #20 0x00007ffbf7371679 in _dl_catch_error (objname=0x7fff091f4a18, 
errstring=0x7fff091f4a20, mallocedp=0x7fff091f4a17, operate=<optimized out>, 
args=<optimized out>) at dl-catch.c:256
     #21 0x00007ffbf65a91b3 in _dlerror_run 
(operate=operate@entry=0x7ffbf65a9670 <dlopen_doit>, 
args=args@entry=0x7fff091f4a70) at dlerror.c:138
     #22 0x00007ffbf65a978f in dlopen_implementation (dl_caller=<optimized 
out>, mode=<optimized out>, file=<optimized out>) at dlopen.c:71
     #23 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81  
                                                                                
               
     #24 0x0000000000bcc915 in plugin_dso_load (path=0x7fff091f4b70 
"/tmp/sb/TSVConnFd/ts/plugin/TSVConnFd.so", handle=@0x7fff091f4b48: 0xd1159e 
<Layout::relative[abi:cxx11](std::basic_string_view<char, std::char_
     traits<char> >)+80>, init=@0x7fff091f4b40: 0x0, error="") at Plugin.cc:105 
     #25 0x0000000000bcce9c in single_plugin_init (argc=3, argv=0x7fff091f5dd0, 
validateOnly=false) at Plugin.cc:153
     #26 0x0000000000bcdbe3 in plugin_init (validateOnly=false) at Plugin.cc:351
     #27 0x00000000008c3e07 in main (argv=0x7fff091f67d8) at 
traffic_server/traffic_server.cc:2160                                           
                  
   ```
   
   ### Analysis
   
   Typically, a deadlock like this would indicate that one of the two threads 
waiting upon the DbgCtl mtx releas holds the PCRE lock that thread 10 is 
awaiting upon. However I'm not sure how this would happen and I cannot find 
direct evidence of that in the other threads.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@trafficserver.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to