Clifford Jansen created PROTON-2483:
---------------------------------------

             Summary: TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch router.
                 Key: PROTON-2483
                 URL: https://issues.apache.org/jira/browse/PROTON-2483
             Project: Qpid Proton
          Issue Type: Bug
          Components: proton-c
    Affects Versions: proton-c-0.36.0
         Environment: linux epoll
            Reporter: Clifford Jansen
            Assignee: Clifford Jansen


The traces are incomplete, but the four-way thread tangle can be inferred as 
follows (each entry lists the pair of locks involved):

  A: pn_proactor_set_timeout()   (p->task.mutex + tm->task.mutex)
  B: pni_timer_manager_process() (tm->task.mutex + tm->deletion_mutex)
  C: pni_connection_timeout()    (tm->deletion_mutex + pc1->task.mutex)
  D: proactor_remove()           (pc1->task.mutex + p->task.mutex)
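
The tangle above can be modelled as a small lock-order graph. The following is
a minimal sketch, not TSAN's implementation and not Proton code: it assumes
each pair is acquired in the order listed (first lock held, second lock
requested), and the enum names and lock_graph_* helpers are hypothetical. A
cycle in the graph is what a lock-order checker reports as a potential
deadlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical labels for the four mutexes named in the trace. */
enum {
    LOCK_P_TASK,       /* p->task.mutex       */
    LOCK_TM_TASK,      /* tm->task.mutex      */
    LOCK_TM_DELETION,  /* tm->deletion_mutex  */
    LOCK_PC1_TASK,     /* pc1->task.mutex     */
    LOCK_COUNT
};

/* edge[a][b] is true when some thread requested b while holding a. */
typedef struct {
    bool edge[LOCK_COUNT][LOCK_COUNT];
} lock_graph;

void lock_graph_init(lock_graph *g) {
    memset(g, 0, sizeof *g);
}

void lock_graph_add(lock_graph *g, int held, int wanted) {
    assert(held >= 0 && held < LOCK_COUNT);
    assert(wanted >= 0 && wanted < LOCK_COUNT);
    g->edge[held][wanted] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on the stack, 2 = finished. */
static bool visit(const lock_graph *g, int node, int *state) {
    state[node] = 1;
    for (int next = 0; next < LOCK_COUNT; next++) {
        if (!g->edge[node][next]) continue;
        if (state[next] == 1) return true;  /* back edge: cycle found */
        if (state[next] == 0 && visit(g, next, state)) return true;
    }
    state[node] = 2;
    return false;
}

bool lock_graph_has_cycle(const lock_graph *g) {
    int state[LOCK_COUNT] = {0};
    for (int n = 0; n < LOCK_COUNT; n++) {
        if (state[n] == 0 && visit(g, n, state)) return true;
    }
    return false;
}
```

Adding the edges for A, B and C alone yields no cycle; adding D's edge
(pc1->task.mutex to p->task.mutex) closes it.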

While this particular trace is a false positive (D occurs after all other 
threads have been joined, so no competing threads remain to complete the 
cycle), the lock ordering is clearly asking for eventual trouble.

The proactor set_timeout and cancel_timeout API calls do not need to hold the 
proactor task lock while interacting with the timer manager, but do so as a 
convenience to prevent collisions between simultaneous sets/cancels.  A 
separate lock can achieve that purpose, stopping A from participating in the 
potential deadlock.
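
The separate-lock idea can be sketched roughly as follows. This is an
illustration under stated assumptions, not the actual Proton change: the
toy_proactor and toy_timer_manager types, the timeout_mutex field, and all
toy_* functions are hypothetical stand-ins. The set and cancel paths take only
the dedicated mutex before calling into the timer manager, so the
p->task.mutex to tm->task.mutex edge from A no longer exists.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the timer manager. */
typedef struct {
    pthread_mutex_t task_mutex;
    bool armed;
    uint64_t deadline_ms;
} toy_timer_manager;

/* Hypothetical stand-in for the proactor. */
typedef struct {
    pthread_mutex_t task_mutex;     /* general proactor state; not taken below */
    pthread_mutex_t timeout_mutex;  /* serializes set/cancel calls only */
    toy_timer_manager tm;
} toy_proactor;

void toy_proactor_init(toy_proactor *p) {
    assert(p != NULL);
    pthread_mutex_init(&p->task_mutex, NULL);
    pthread_mutex_init(&p->timeout_mutex, NULL);
    pthread_mutex_init(&p->tm.task_mutex, NULL);
    p->tm.armed = false;
    p->tm.deadline_ms = 0;
}

/* Timer manager entry point: takes only its own lock. */
static void toy_timer_update(toy_timer_manager *tm, bool armed,
                             uint64_t deadline_ms) {
    pthread_mutex_lock(&tm->task_mutex);
    tm->armed = armed;
    tm->deadline_ms = deadline_ms;
    pthread_mutex_unlock(&tm->task_mutex);
}

/* Lock order: timeout_mutex, then tm.task_mutex. The proactor task mutex
 * is never held here. */
void toy_proactor_set_timeout(toy_proactor *p, uint64_t now_ms,
                              uint32_t delay_ms) {
    assert(p != NULL);
    pthread_mutex_lock(&p->timeout_mutex);
    toy_timer_update(&p->tm, true, now_ms + delay_ms);
    pthread_mutex_unlock(&p->timeout_mutex);
}

void toy_proactor_cancel_timeout(toy_proactor *p) {
    assert(p != NULL);
    pthread_mutex_lock(&p->timeout_mutex);
    toy_timer_update(&p->tm, false, 0);
    pthread_mutex_unlock(&p->timeout_mutex);
}
```

Because nothing in this sketch acquires timeout_mutex while holding another
lock, it cannot become a link in a lock-order cycle.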
