[jira] [Updated] (THRIFT-3932) C++ ThreadManager has a rare termination race

JIRA Tue, 20 Sep 2016 01:02:51 -0700

     [ 
https://issues.apache.org/jira/browse/THRIFT-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Buğra Gedik updated THRIFT-3932:
--------------------------------
    Description: 
{{ThreadManager::join}} calls {{stopImpl(true)}}, which in turn calls 
{{removeWorker(workerCount_);}}. The latter waits while {{workerCount_ != 
workerMaxCount_}}. Within the {{run}} method of the workers, the last 
thread that detects {{workerCount_ == workerMaxCount_}} notifies 
{{removeWorker}}. The {{run}} method has the following additional code that is 
executed at the very end:

{code}
    {
      Synchronized s(manager_->workerMonitor_);
      manager_->deadWorkers_.insert(this->thread());
      if (notifyManager) {
        manager_->workerMonitor_.notify();
      }
    }
{code}

This is an independent synchronized block. Now assume two worker threads. One 
of them has {{notifyManager=true}} because it detected the {{workerCount_ == 
workerMaxCount_}} condition earlier. This thread may execute the above code 
first, at which point the ThreadManager's {{removeWorker}} method unblocks, 
{{join}} eventually returns, and the ThreadManager object is destroyed. When 
the other thread then reaches the synchronized block above, it dereferences 
{{manager_}} after the manager is gone and can crash.

Besides, the ThreadManager never joins its threads.

Attached is a small fix that alleviates these problems.
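
The attached patch is not reproduced in this message. As an illustration only, 
here is a minimal sketch of one way to close this kind of race: wait for the 
notification as before, then also join every worker thread, so that no worker 
can still be heading for its final synchronized block when {{join}} returns. 
{{MiniManager}}, {{runAndJoin}} and all member names are hypothetical, 
{{std::thread}} stands in for Thrift's own thread classes, and this is not 
necessarily what the patch does.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified stand-in for ThreadManager; not Thrift's code.
class MiniManager {
public:
  explicit MiniManager(std::size_t workers) : live_(workers), dead_(0) {
    for (std::size_t i = 0; i < workers; ++i) {
      threads_.emplace_back([this] { run(); });
    }
  }

  // Joining in the destructor keeps the object alive until all workers exit.
  ~MiniManager() { join(); }

  void join() {
    {
      std::unique_lock<std::mutex> lock(mutex_);
      // Analogue of removeWorker() waiting for the last worker's notify.
      cv_.wait(lock, [this] { return live_ == 0; });
    }
    // The extra step: without it, join() could return while another worker
    // has not yet entered its final block, which is the race described above.
    for (std::thread& t : threads_) {
      if (t.joinable()) {
        t.join();
      }
    }
  }

  std::size_t deadWorkers() {
    std::lock_guard<std::mutex> lock(mutex_);
    return dead_;
  }

private:
  void run() {
    bool notifyManager = false;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      notifyManager = (--live_ == 0);  // only the last worker notifies
    }
    // Independent final block, mirroring the snippet in the description.
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++dead_;
      if (notifyManager) {
        cv_.notify_all();
      }
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::size_t live_;
  std::size_t dead_;
  std::vector<std::thread> threads_;  // declared last: workers use the rest
};

// Starts the workers, joins them, and reports how many reached the final block.
std::size_t runAndJoin(std::size_t workers) {
  MiniManager manager(workers);
  manager.join();
  return manager.deadWorkers();
}
```

In this sketch {{runAndJoin(2)}} returns only after both workers have run 
their final block, so destroying the manager afterwards is safe. It also 
addresses the second point above, since every thread is explicitly joined.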

  was:
{{ThreadManager::join}} calls {{stopImpl(true)}}, which in turn calls 
{{removeWorker(workerCount_);}}. The latter waits until {{while (workerCount_ 
!= workerMaxCount_)}}. In the run method, the last thread that detects 
{{workerCount_ == workerMaxCount_}} notifies the {{removeWorker}} method. 
However, the run method has the following additional code that is executed at 
the very end:

{code}
{
      Synchronized s(manager_->workerMonitor_);
      manager_->deadWorkers_.insert(this->thread());
      if (notifyManager) {
        manager_->workerMonitor_.notify();
      }
    }
{code}

This is an independent synchronized block. Now assume 2 threads. One of them 
has {{notifyManager=true}} as it detected the {{workerCount_ == 
workerMaxCount_}} condition earlier. It is possible that this thread gets to 
execute  the above code first, and the ThreadManager's {{removeWorker}} method 
unblocks and eventually the ThreadManager's {{join}} returns and the object 
destructed. When the other thread reaches the synchronized block above, it will 
crash, as the manager is not around anymore.

Besides, the ThreadManager never joins its threads.

Attached is a small fix that alleviates the problem.


> C++ ThreadManager has a rare termination race
> ---------------------------------------------
>
>                 Key: THRIFT-3932
>                 URL: https://issues.apache.org/jira/browse/THRIFT-3932
>             Project: Thrift
>          Issue Type: Bug
>            Reporter: Buğra Gedik
>         Attachments: thrift-patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
