ExceptionTest.AbortOnFrameworkError sometimes hangs if Mesos is built without 
optimizations
----------------------------------------------------------------------------------------

                 Key: MESOS-77
                 URL: https://issues.apache.org/jira/browse/MESOS-77
             Project: Mesos
          Issue Type: Bug
          Components: test
         Environment: Linux, GCC 4.3.2, configuring with CXXFLAGS='-fPIC -g3 
-O0' CFLAGS='-fPIC -g3 -O0'


            Reporter: Charles Reiss
            Priority: Minor
         Attachments: backtrace.txt

make test hangs at "[ RUN      ] ExceptionTest.AbortOnFrameworkError" 
consistently on my setup (on the R cluster at Berkeley). I think the 
underlying issue is a race, so I wouldn't be surprised if it were hard to 
reproduce consistently elsewhere.

After debugging, the problem appears to be that the 
mesos::internal::SchedulerProcess is terminated using process::terminate(). 
However, before ~MesosSchedulerDriver is called, the process is revived by 
the delay() handler in SchedulerProcess::doReliableRegistration(), so the 
wait() in ~MesosSchedulerDriver blocks indefinitely.
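
To make the suspected ordering concrete, here is a rough, single-threaded toy 
model of it. It is only a sketch: ToyProcess, scheduleRetry, fireTimers, 
waitWouldReturn and waitReturnsAfterLateTimer are hypothetical names made up 
for illustration, not libprocess or Mesos APIs, and the "guarded" variant is 
just one conceivable mitigation, not necessarily how this should be fixed.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy model only: none of these names are libprocess or Mesos APIs.
enum class State { Running, Terminated };

struct ToyProcess {
  State state = State::Running;
  // Callbacks scheduled to run "later", standing in for delay().
  std::vector<std::function<void()>> pendingTimers;

  // Schedules a retry, loosely modelled on doReliableRegistration().
  // If `guarded` is true, the callback ignores a terminated process.
  void scheduleRetry(bool guarded) {
    pendingTimers.push_back([this, guarded]() {
      if (guarded && state == State::Terminated) {
        return;  // Drop the retry instead of reviving the process.
      }
      state = State::Running;  // Delivering the retry makes the process live again.
    });
  }

  void terminate() { state = State::Terminated; }

  // Fires every pending timer, as if the delay had elapsed.
  void fireTimers() {
    auto timers = std::move(pendingTimers);
    pendingTimers.clear();
    for (auto& timer : timers) timer();
  }

  // A real wait() would block until termination; this only reports
  // whether such a wait could ever return.
  bool waitWouldReturn() const { return state == State::Terminated; }
};

// Runs terminate -> timer fires -> wait, the ordering described above.
bool waitReturnsAfterLateTimer(bool guarded) {
  ToyProcess process;
  process.scheduleRetry(guarded);
  process.terminate();
  process.fireTimers();  // The delayed handler runs before the wait.
  return process.waitWouldReturn();
}
```

In this model, waitReturnsAfterLateTimer(false) reports that the wait could 
never return, which matches the hang, while waitReturnsAfterLateTimer(true) 
reports that it would. The real code is multi-threaded, so whether the timer 
fires between terminate() and the destructor depends on timing, which would 
fit the hang being hard to reproduce elsewhere.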


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
