-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16724/
-----------------------------------------------------------

(Updated Feb. 17, 2014, 4:23 p.m.)


Review request for mesos, Benjamin Hindman, Ben Mahler, Niklas Nielsen, and 
Vinod Kone.


Changes
-------

Added a unit test that starts a task/framework, then kills the task and shuts 
down the framework, leaving a completedFramework on the slave. After restarting 
the master, the slave reregisters with the new master and the completed 
framework is added to the new master's state. 
Master and slave state is read using find/substr on each one's state.json 
endpoint. A better approach would use a real JSON parser to reach the nested 
elements (to verify the executor/task status), and an even better approach 
would make the test a friend of Master/Slave so it can read the 
frameworks/completedFrameworks collections directly.
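
For reviewers who want a feel for the find/substr style of check described 
above, here is a minimal sketch. It is not the code in the diff: the helper 
names extractArray and arrayMentions are hypothetical, and the key names 
"completed_frameworks" and "frameworks" and the compact (no whitespace) JSON 
layout are assumptions about the state.json output.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text of the JSON array stored under `key`,
// brackets included, or an empty string if the key is not found. This is
// plain string scanning, not a JSON parser: it assumes the key is written as
// "key":[ with no whitespace and ignores brackets inside string values.
std::string extractArray(const std::string& json, const std::string& key)
{
  const std::string needle = "\"" + key + "\":[";
  const std::size_t start = json.find(needle);
  if (start == std::string::npos) {
    return "";
  }

  const std::size_t open = start + needle.size() - 1; // Index of '['.
  int depth = 0;
  for (std::size_t i = open; i < json.size(); ++i) {
    if (json[i] == '[') {
      ++depth;
    } else if (json[i] == ']') {
      if (--depth == 0) {
        return json.substr(open, i - open + 1);
      }
    }
  }
  return ""; // Unbalanced brackets.
}

// Hypothetical helper: true if `frameworkId` appears anywhere inside the
// array stored under `key`.
bool arrayMentions(
    const std::string& json,
    const std::string& key,
    const std::string& frameworkId)
{
  return extractArray(json, key).find(frameworkId) != std::string::npos;
}
```

A test could then assert that the framework id is mentioned under 
"completed_frameworks" and not under "frameworks". The fragility is visible 
in the comments: any change in whitespace or nesting breaks the scan, which 
is why a real parser or direct access to the collections would be preferable.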


Bugs: MESOS-767
    https://issues.apache.org/jira/browse/MESOS-767


Repository: mesos-git


Description
-------

Added completed frameworks/tasks to slave re-registration.
Fixes MESOS-767.

Additional issues discovered during investigation:
- MESOS-905: Remove Framework.id in favor of FrameworkInfo.id
- MESOS-906: Last task in Completed Framework never graduates from
terminatedTasks to completedTasks.
- Completed frameworks/executors/tasks are stored in circular buffers,
and these may overflow in different orders on different slaves. 
BenH proposes an archive to replace these circular buffers.
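
To illustrate the circular-buffer concern in the last item, here is a small 
sketch under stated assumptions. BoundedHistory is a hypothetical stand-in 
built on std::deque, not the container Mesos uses, and the capacity of 2 in 
the usage note is chosen only to keep the example short.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for a fixed-capacity history: once `capacity`
// entries are held, adding another silently drops the oldest one.
class BoundedHistory
{
public:
  explicit BoundedHistory(std::size_t capacity) : capacity_(capacity) {}

  void add(const std::string& id)
  {
    if (capacity_ == 0) {
      return; // Nothing can be retained.
    }
    if (entries_.size() == capacity_) {
      entries_.pop_front(); // Evict the oldest entry.
    }
    entries_.push_back(id);
  }

  bool contains(const std::string& id) const
  {
    for (const std::string& entry : entries_) {
      if (entry == id) {
        return true;
      }
    }
    return false;
  }

private:
  std::size_t capacity_;
  std::deque<std::string> entries_;
};
```

With a capacity of 2, a slave that records f1, f2, f3 no longer contains f1, 
while a slave that records f3, f1 still does. What each slave can report on 
re-registration therefore depends on its own completion order.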


Diffs (updated)
-----

  include/mesos/scheduler.hpp 2e4707e 
  src/master/master.hpp 7649737 
  src/master/master.cpp 77872ec 
  src/messages/messages.proto 922a8c4 
  src/slave/slave.cpp 2d21e16 
  src/tests/fault_tolerance_tests.cpp 60e06cc 
  src/tests/mesos.hpp d7bdaee 

Diff: https://reviews.apache.org/r/16724/diff/


Testing (updated)
-------

make check; manually failed over a master, watched the slave reregister its 
completed frameworks, and confirmed that the web UI shows completed tasks and 
their stdout/stderr.
Added a new unit/integration test to verify the expected behavior.


Thanks,

Adam B
