[ 
https://issues.apache.org/jira/browse/THRIFT-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030486#comment-14030486
 ] 

Grzegorz Leszczyński edited comment on THRIFT-2521 at 6/13/14 10:56 AM:
------------------------------------------------------------------------

Sorry, but I will not work on this test: I'm not sure it is even possible, 
and moreover I don't have time for it. We applied this patch at our company 
and I just want to help by sharing it with you. What happens with the patch 
is up to you :-)

I can briefly describe the main problem (apart from some missing locks and 
unused parameters and variables). When we stop ThreadManager, all workers 
need to stop. When the last of them decreases workerCount_ to 0 in 
ThreadManager::Worker::run and calls workerMonitor_.notify(), 
ThreadManager::Impl::removeWorker, called from ThreadManager::Impl::stopImpl, 
gets past "while (workerCount_ != workerMaxCount_) { workerMonitor_.wait(); }" 
and ThreadManager gets destroyed. At that moment, however, other workers can 
still be in ThreadManager::Worker::run, somewhere between 
"manager_->workerCount_--;" and "Synchronized s(manager_->workerMonitor_);". 
When they are scheduled again they crash, because ThreadManager has already 
been destroyed.
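
To illustrate the idea, here is a minimal sketch. It is not the attached 
patch and not Thrift's real code: it uses std::mutex and 
std::condition_variable instead of Thrift's Monitor, and MiniManager, 
workerRun and runAndStop are made-up names. It only assumes that the 
decrement and the notify happen inside one critical section, and that the 
stopping thread joins the workers before the object goes away.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for ThreadManager::Impl; only the member names
// workerCount_ and workerMaxCount_ are borrowed from the comment above.
class MiniManager {
public:
  explicit MiniManager(int workers)
    : workerCount_(workers), workerMaxCount_(workers) {
    for (int i = 0; i < workers; ++i) {
      threads_.emplace_back([this] { workerRun(); });
    }
  }

  ~MiniManager() { stop(); }

  // Ask every worker to exit and block until all of them have done so.
  // Returns the number of workers still registered (expected to be 0).
  int stop() {
    {
      std::unique_lock<std::mutex> lock(mutex_);
      stopping_ = true;
      workerMaxCount_ = 0;
      monitor_.notify_all(); // wake idle workers
      // Same condition as in the comment, but always checked under the lock.
      monitor_.wait(lock, [this] { return workerCount_ == workerMaxCount_; });
    }
    // Joining guarantees that no worker is still running code that uses `this`.
    for (std::thread& t : threads_) {
      t.join();
    }
    threads_.clear();
    return workerCount_;
  }

private:
  void workerRun() {
    std::unique_lock<std::mutex> lock(mutex_);
    // Idle until a stop is requested.
    monitor_.wait(lock, [this] { return stopping_; });
    // Decrement and notify while still holding the lock: the stopping thread
    // cannot get past its wait until this worker has released the mutex.
    --workerCount_;
    if (workerCount_ == workerMaxCount_) {
      monitor_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable monitor_;
  int workerCount_;
  int workerMaxCount_;
  bool stopping_ = false;
  std::vector<std::thread> threads_;
};

// Starts `workers` threads, stops them, and returns the remaining count.
int runAndStop(int workers) {
  MiniManager manager(workers);
  return manager.stop();
}
```

Calling runAndStop(4) starts four workers and stops them again. In the 
unpatched code the window was between the unlocked decrement and taking the 
monitor; in this sketch that window does not exist, because both steps are 
done under the same lock.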





> Fixed synchronisation in ThreadManager.cpp
> ------------------------------------------
>
>                 Key: THRIFT-2521
>                 URL: https://issues.apache.org/jira/browse/THRIFT-2521
>             Project: Thrift
>          Issue Type: Bug
>          Components: C++ - Library
>    Affects Versions: 0.9.2, 1.0
>            Reporter: Grzegorz Leszczyński
>              Labels: patch
>             Fix For: 0.9.2, 1.0
>
>         Attachments: thrift-2521-ThreadManager.patch
>
>
> The server can crash when stop is called. The patch also fixes other minor 
> synchronisation problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
