[ 
https://issues.apache.org/jira/browse/THRIFT-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Buğra Gedik updated THRIFT-3932:
--------------------------------
    Description: 
{{ThreadManager::join}} calls {{stopImpl(true)}}, which in turn calls 
{{removeWorker(workerCount_);}}. The latter blocks in the loop {{while 
(workerCount_ != workerMaxCount_)}}. Within the {{run}} method of the workers, 
the last thread to detect {{workerCount_ == workerMaxCount_}} notifies 
{{removeWorker}}. The {{run}} method also has the following code, which is 
executed at the very end:

{code}
    {
      Synchronized s(manager_->workerMonitor_);
      manager_->deadWorkers_.insert(this->thread());
      if (notifyManager) {
        manager_->workerMonitor_.notify();
      }
    }
{code}

This is an independent synchronized block. Now assume two worker threads. One 
of them has {{notifyManager=true}} because it detected the {{workerCount_ == 
workerMaxCount_}} condition earlier. This thread may execute the above code 
first, so that the ThreadManager's {{removeWorker}} method unblocks, {{join}} 
eventually returns, and the object is destroyed. When the other thread then 
reaches the synchronized block above, it accesses {{manager_}} after the 
manager has been destroyed and can crash.

Besides, {{ThreadManager}} never joins its threads.

Attached is a small fix that solves these problems.
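
For illustration only, here is a minimal sketch of the general idea of joining the workers before the manager may be destroyed. It is not the attached patch and not Thrift code: {{MiniManager}}, {{deadWorkers()}} and all member names are hypothetical, and it uses {{std::thread}} in place of Thrift's own thread and monitor classes. The worker's {{run}} keeps the same two separate synchronized blocks, but {{join}} also joins every thread after being notified, so a slow worker can no longer touch a destroyed manager.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread manager; not the Thrift class.
class MiniManager {
public:
  explicit MiniManager(std::size_t n) : total_(n) {
    for (std::size_t i = 0; i < n; ++i) {
      workers_.emplace_back([this] { run(); });
    }
  }

  ~MiniManager() { join(); }

  // Ask the workers to exit, wait for the last one to notify, then join
  // every thread so that none of them can outlive this object.
  void join() {
    {
      std::unique_lock<std::mutex> lock(mutex_);
      stopping_ = true;
      wake_.notify_all();
      done_.wait(lock, [this] { return exited_ == total_; });
    }
    // Without this loop, a worker could still be about to enter its final
    // block below when the caller destroys the manager.
    for (std::thread& t : workers_) {
      if (t.joinable()) {
        t.join();
      }
    }
    assert(dead_ == total_);
  }

  std::size_t deadWorkers() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return dead_;
  }

private:
  void run() {
    bool notifyManager = false;
    {
      std::unique_lock<std::mutex> lock(mutex_);
      wake_.wait(lock, [this] { return stopping_; });
      ++exited_;
      notifyManager = (exited_ == total_);  // the last worker to exit
    }
    // Independent final block, mirroring the snippet quoted above.
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++dead_;
      if (notifyManager) {
        done_.notify_one();
      }
    }
  }

  mutable std::mutex mutex_;
  std::condition_variable wake_;
  std::condition_variable done_;
  const std::size_t total_;
  std::size_t exited_ = 0;
  std::size_t dead_ = 0;
  bool stopping_ = false;
  std::vector<std::thread> workers_;
};
```

In this sketch, constructing a {{MiniManager}} with four workers and calling {{join()}} returns only after all four have passed through the final block, so {{deadWorkers()}} reports 4 and the object can be destroyed safely.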

  was:
{{ThreadManager::join}} calls {{stopImpl(true)}}, which in turn calls 
{{removeWorker(workerCount_);}}. The latter waits until {{while (workerCount_ 
!= workerMaxCount_)}}. Within the {{run}} method of the workers, the last 
thread that detects {{workerCount_ == workerMaxCount_}} notifies 
{{removeWorker}}. The {{run}} method has the following additional code that is 
executed at the very end:

{code}
    {
      Synchronized s(manager_->workerMonitor_);
      manager_->deadWorkers_.insert(this->thread());
      if (notifyManager) {
        manager_->workerMonitor_.notify();
      }
    }
{code}

This is an independent synchronized block. Now assume 2 threads. One of them 
has {{notifyManager=true}} as it detected the {{workerCount_ == 
workerMaxCount_}} condition earlier. It is possible that this thread gets to 
execute  the above code first, and the ThreadManager's {{removeWorker}} method 
unblocks and eventually the ThreadManager's {{join}} returns and the object is 
destructed. When the other thread reaches the synchronized block above, it will 
crash, as the manager is not around anymore.

Besides, the ThreadManager never joins its threads.

Attached is a small fix that solves these problems.


> C++ ThreadManager has a rare termination race
> ---------------------------------------------
>
>                 Key: THRIFT-3932
>                 URL: https://issues.apache.org/jira/browse/THRIFT-3932
>             Project: Thrift
>          Issue Type: Bug
>            Reporter: Buğra Gedik
>         Attachments: thrift-patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)