Github user tpcwang commented on a diff in the pull request:

    https://github.com/apache/thrift/pull/980#discussion_r58814042
  
    --- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
    @@ -92,29 +97,42 @@ TThreadedServer::~TThreadedServer() {
     void TThreadedServer::serve() {
       TServerFramework::serve();
     
    -  // Drain all clients - no more will arrive
    -  try {
    -    Synchronized s(clientsMonitor_);
    -    while (getConcurrentClientCount() > 0) {
    -      clientsMonitor_.wait();
    -    }
    -  } catch (TException& tx) {
    -    string errStr = string("TThreadedServer: Exception joining workers: ") 
+ tx.what();
    -    GlobalOutput(errStr.c_str());
    +  // Ensure post-condition of no active clients
    +  Synchronized s(clientMonitor_);
    +  while (!clientMap_.empty()) {
    +    clientMonitor_.wait();
       }
     }
     
     void TThreadedServer::onClientConnected(const 
shared_ptr<TConnectedClient>& pClient) {
    -  threadFactory_->newThread(pClient)->start();
    +  Synchronized sync(clientMonitor_);
    +  clientMap_.insert(ClientMap::value_type(pClient.get(), 
boost::make_shared<TConnectedClientTracker>(pClient)));
    +
    +  // We do not track the threads themselves
    +  ClientMap::const_iterator it = clientMap_.find(pClient.get());
    +  threadFactory_->newThread(it->second)->start();
     }
     
     void TThreadedServer::onClientDisconnected(TConnectedClient* pClient) {
    -  THRIFT_UNUSED_VARIABLE(pClient);
    -  Synchronized s(clientsMonitor_);
    -  if (getConcurrentClientCount() == 0) {
    -    clientsMonitor_.notify();
    +  Synchronized sync(clientMonitor_);
    +  clientMap_.erase(pClient);
    +  if (clientMap_.empty()) {
    --- End diff --
    
    This doesn't fix the issue because as soon as you notify clientMonitor_, 
TThreadedServer::serve() will exit and clients are free to release 
TThreadedServer, but the thread that initiated onClientDisconnected is still in 
the middle of disposeConnectedClient, which it should not be executing any more 
because the object is already destroyed.
    
    Try adding one second sleep after the delete in disposeConnectedClient and 
run the tests. I see the same segfault in TServerIntegrationTest. I am not sure 
we can get away with not joining the threads.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to