> On June 17, 2014, 12:28 a.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1219-1224
> > <https://reviews.apache.org/r/22313/diff/7/?file=607037#file607037line1219>
> >
> >     Is this possible? AFAICT, since the task was added to the executor the 
> > executor shouldn't be removed between _runTask() and __runTask(). Even if 
> > the executor terminates in between, this task should've been marked 
> > 'terminated' but not 'completed' (i.e., waiting for an ACK) and hence the 
> > executor won't be removed from the framework's map. Since there is a 
> > pending executor, the framework shouldn't be removed.
> >     
> >     So this can be a CHECK_NOTNULL(framework) with a comment on why it can 
> > be a check.
> 
> Yifan Gu wrote:
>     Good point! I am currently trying to add a test that kills the framework 
> before containerizer->update() returns, to exercise this. Thanks for pointing 
> it out!

Hi Vinod, I found that the framework can in fact be NULL here. It seems that 
because the executor is no longer in framework->pending() at this point (it is 
removed from the pending queue at the beginning of _runTask()), the executor, 
and with it the framework, can be removed.

The executor/framework shutdown logic is hard to follow at a glance, so I ran 
an experiment.

I have uploaded a test and logs, which show that the framework can be removed 
before __runTask() is called. 
I would really appreciate it if you could take a look, in case I missed 
something. Thank you!
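
As an illustration only, the interleaving described above can be modeled with a 
toy sketch. This is not the actual slave code: ToyFramework, ToySlave, 
beginRunTask(), executorGone() and finishRunTask() are hypothetical stand-ins, 
and the real removal conditions are more involved than the single "nothing 
pending, no executors" check assumed here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a framework's bookkeeping (hypothetical, not Mesos code).
struct ToyFramework {
  std::set<std::string> pending;    // tasks whose launch has not started
  std::set<std::string> executors;  // executors currently tracked
};

struct ToySlave {
  std::map<std::string, ToyFramework> frameworks;

  // Returns NULL (nullptr) when the framework is no longer tracked.
  ToyFramework* getFramework(const std::string& frameworkId) {
    auto it = frameworks.find(frameworkId);
    return it == frameworks.end() ? nullptr : &it->second;
  }

  // Models the start of _runTask(): the task leaves the pending set,
  // then an asynchronous containerizer update would be started.
  void beginRunTask(const std::string& frameworkId,
                    const std::string& taskId) {
    ToyFramework* framework = getFramework(frameworkId);
    if (framework != nullptr) {
      framework->pending.erase(taskId);
    }
  }

  // Models an executor going away while the update is still in flight.
  // ASSUMPTION: an idle framework (nothing pending, no executors) is removed.
  void executorGone(const std::string& frameworkId,
                    const std::string& executorId) {
    ToyFramework* framework = getFramework(frameworkId);
    if (framework == nullptr) {
      return;
    }
    framework->executors.erase(executorId);
    if (framework->pending.empty() && framework->executors.empty()) {
      frameworks.erase(frameworkId);
    }
  }

  // Models __runTask(): must cope with the framework having been removed.
  // Returns false when there is no framework left to launch the task for.
  bool finishRunTask(const std::string& frameworkId) {
    return getFramework(frameworkId) != nullptr;
  }
};
```

In this model, calling beginRunTask(), then executorGone(), then 
finishRunTask() leaves finishRunTask() with no framework, which is why a NULL 
check (rather than a hard CHECK) is needed under these assumptions.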

One option would be to add the task back to the pending queue before calling 
containerizer->update().
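
A rough sketch of that idea, again as a toy model rather than the real slave 
code (SketchFramework, SketchSlave, beforeUpdate(), removeIfIdle() and 
afterUpdate() are hypothetical names, and it assumes a framework is only 
removed when its pending set is empty):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical toy model, not the real Mesos Slave/Framework classes.
struct SketchFramework {
  std::set<std::string> pending;  // tasks with launch work still in flight
};

struct SketchSlave {
  std::map<std::string, SketchFramework> frameworks;

  // Models the step just before containerizer->update(): put the task
  // back into the pending set so the framework stays referenced.
  void beforeUpdate(const std::string& frameworkId,
                    const std::string& taskId) {
    auto it = frameworks.find(frameworkId);
    if (it != frameworks.end()) {
      it->second.pending.insert(taskId);
    }
  }

  // ASSUMPTION: a framework is removed only when nothing is pending.
  // Returns true if the framework was removed.
  bool removeIfIdle(const std::string& frameworkId) {
    auto it = frameworks.find(frameworkId);
    if (it == frameworks.end() || !it->second.pending.empty()) {
      return false;
    }
    frameworks.erase(it);
    return true;
  }

  // Models __runTask(): drop the pending entry, then continue the launch.
  // Returns false if the framework is gone.
  bool afterUpdate(const std::string& frameworkId,
                   const std::string& taskId) {
    auto it = frameworks.find(frameworkId);
    if (it == frameworks.end()) {
      return false;
    }
    it->second.pending.erase(taskId);
    return true;
  }
};
```

Under that assumption, a removal attempt between beforeUpdate() and 
afterUpdate() is a no-op, so the continuation always finds the framework. An 
explicit framework shutdown might bypass such a check, so the NULL handling in 
__runTask() may still be needed.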


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22313/#review45858
-----------------------------------------------------------


On June 11, 2014, 9:32 a.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22313/
> -----------------------------------------------------------
> 
> (Updated June 11, 2014, 9:32 a.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-886
>     https://issues.apache.org/jira/browse/MESOS-886
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added __runTask() to wait for the completion of containerizer->update() and 
> check the result before sending RunTaskMessage.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e5 
>   src/slave/slave.cpp 643c088 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22313/diff/
> 
> 
> Testing
> -------
> 
> SlaveTest, CancelTaskIfContainerizerFails
> 
> This tests that if containerizer->update() returns a Failure, the task 
> won't be launched and the scheduler will get TASK_LOST.
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>
