> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1219-1224
> > <https://reviews.apache.org/r/22313/diff/7/?file=607037#file607037line1219>
> >
> > Is this possible? AFAICT, since the task was added to the executor, the
> > executor shouldn't be removed between _runTask() and __runTask(). Even if
> > the executor terminates in between, this task should've been marked
> > 'terminated' but not 'completed' (i.e., waiting for an ACK) and hence the
> > executor won't be removed from the framework's map. Since there is a
> > pending executor, the framework shouldn't be removed.
> >
> > So this can be a CHECK_NOTNULL(framework) with a comment on why it can
> > be a check.
>
> Yifan Gu wrote:
>     Good point! I am currently trying to add a test that kills the framework
>     before containerizer->update() returns, to test this. Thanks for pointing
>     this out!

Hi Vinod,

I found that the framework can in fact be NULL here. It seems that because the
executor is no longer in framework->pending() at this point (it is removed from
the pending queue at the beginning of _runTask()), the executor can be removed.

This executor/framework shutdown logic is not easy to follow at a glance, so I
ran an experiment. I have uploaded a test and logs; they show that the
framework can be removed before __runTask() is called. I would appreciate it if
you could take a look in case I missed something. Thank you!

I think maybe I can add the task to the pending queue again before calling
containerizer->update().


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
>
> (Updated June 11, 2014, 9:32 a.m.)
>
>
> Review request for mesos, Ian Downes and Vinod Kone.
>
>
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Added __runTask() to wait for the completion of containerizer->update() and
> check the result before sending RunTaskMessage.
>
>
> Diffs
> -----
>
>   src/slave/slave.hpp 34687e5
>   src/slave/slave.cpp 643c088
>   src/tests/slave_tests.cpp 2c8f183
>
> Diff: https://reviews.apache.org/r/22313/diff/
>
>
> Testing
> -------
>
> SlaveTest, CancelTaskIfContainerizerFails
>
> This tests that if containerizer->update() returns a Failure, the task
> won't be launched and the scheduler will get TASK_LOST.
>
> make check
>
>
> Thanks,
>
> Yifan Gu
>
>
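
For reference, here is a minimal, self-contained sketch of the race discussed
in this thread. It is not the Mesos code: Framework, Slave, Outcome,
dequeueAndDefer() and removeFramework() are hypothetical stand-ins, and the
returned callable only imitates a continuation that runs after an asynchronous
update completes. The point it tries to show is that once the task has left
the pending set, nothing in this toy model keeps the framework alive, so the
continuation looks the framework up by ID again and handles the "gone" case
rather than assuming it is non-NULL.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for a framework; not the Mesos class.
struct Framework {
  std::set<std::string> pending;  // IDs of tasks still queued to run
};

// Result of the deferred second stage in this toy model.
enum class Outcome { LAUNCHED, FRAMEWORK_GONE };

// Hypothetical, simplified stand-in for the slave.
struct Slave {
  std::map<std::string, Framework> frameworks;

  // First stage: take the task off the pending set and return the
  // continuation that would run once the asynchronous update finishes.
  std::function<Outcome()> dequeueAndDefer(const std::string& frameworkId,
                                           const std::string& taskId) {
    auto it = frameworks.find(frameworkId);
    if (it != frameworks.end()) {
      it->second.pending.erase(taskId);
    }

    // Capture the ID by value, not a pointer to the Framework: the map
    // entry may be erased before the continuation runs.
    return [this, frameworkId]() {
      auto found = frameworks.find(frameworkId);
      if (found == frameworks.end()) {
        return Outcome::FRAMEWORK_GONE;  // removed in between
      }
      return Outcome::LAUNCHED;
    };
  }

  // Models a framework shutdown arriving between the two stages.
  void removeFramework(const std::string& frameworkId) {
    frameworks.erase(frameworkId);
  }
};
```

In this model, calling removeFramework() between dequeueAndDefer() and the
returned continuation yields Outcome::FRAMEWORK_GONE, which is the case a
plain non-NULL check would abort on.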