ExceptionTest.AbortOnFrameworkError sometimes hangs if Mesos is built without
optimizations
----------------------------------------------------------------------------------------
Key: MESOS-77
URL: https://issues.apache.org/jira/browse/MESOS-77
Project: Mesos
Issue Type: Bug
Components: test
Environment: Linux, GCC 4.3.2, configuring with CXXFLAGS='-fPIC -g3
-O0' CFLAGS='-fPIC -g3 -O0'
Reporter: Charles Reiss
Priority: Minor
Attachments: backtrace.txt
make test hangs at "[ RUN ] ExceptionTest.AbortOnFrameworkError"
consistently on my setup (on the R cluster at Berkeley). I think the
underlying issue is a race, so I wouldn't be surprised if it were hard to
reproduce consistently elsewhere.
From debugging, the problem appears to be that the
mesos::internal::SchedulerProcess is terminated using process::terminate().
However, before ~MesosSchedulerDriver is called, the process is revived by
the delay() handler in SchedulerProcess::doReliableRegistration(), causing the
wait() in ~MesosSchedulerDriver to wait indefinitely.
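The suspected sequence can be sketched with a small single-threaded toy model.
This is not libprocess or Mesos code: ToyProcess, fireTimers(), the guardTimers
flag and stillRunningAfterTerminate() are hypothetical names made up for
illustration, and the guard shown is only one possible mitigation, not
necessarily the right fix. The sketch assumes a pending delayed callback can
still fire after the process has been terminated.

```cpp
#include <cassert>
#include <functional>
#include <queue>

// Toy stand-in for a process with delayed callbacks. Hypothetical names
// throughout; this does not use or mirror the real libprocess API.
class ToyProcess {
public:
  explicit ToyProcess(bool guardTimers) : guardTimers_(guardTimers) {}

  // Models a registration step that re-arms itself through a delayed callback.
  void doReliableRegistration() {
    assert(running_);  // registration only happens on a running process
    pendingTimers_.push([this] { onTimer(); });
  }

  // Models terminating the process.
  void terminate() { running_ = false; }

  // Fires every callback that is pending right now. Callbacks armed while
  // firing land in pendingTimers_ and wait for the next call.
  void fireTimers() {
    std::queue<std::function<void()>> due;
    due.swap(pendingTimers_);
    while (!due.empty()) {
      due.front()();
      due.pop();
    }
  }

  // A waiter blocked on this process would return only once this is false.
  bool running() const { return running_; }

private:
  void onTimer() {
    // Guarded variant: a terminated process drops late callbacks.
    if (guardTimers_ && !running_) return;
    // Unguarded variant: the late callback revives the process and re-arms.
    running_ = true;
    doReliableRegistration();
  }

  bool guardTimers_;
  bool running_ = true;
  std::queue<std::function<void()>> pendingTimers_;
};

// Replays the suspected order of events: arm the timer, terminate, then let
// the timer fire. Returns whether the process is still running afterwards.
bool stillRunningAfterTerminate(bool guardTimers) {
  ToyProcess process(guardTimers);
  process.doReliableRegistration();
  process.terminate();
  process.fireTimers();
  return process.running();
}
```

In this model, stillRunningAfterTerminate(false) reports a process that is
running again after terminate(), which is the state in which a waiter would
block forever. With the guard enabled, the late callback is dropped and the
process stays terminated.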