> On June 29, 2014, 11:44 p.m., Adam B wrote:
> > src/master/master.hpp, lines 852-854
> > <https://reviews.apache.org/r/23147/diff/1/?file=620067#file620067line852>
> >
> >     // We mark a slave 'inactive' ...
> >     bool active;
>
> Alexandra Sava wrote:
>     I have some concerns about changing this. As the comment says, a slave is
>     marked as 'disconnected' when it is checkpointing. I would like to use the
>     'deactivated' key for slaves for which no resource offers are being sent
>     (this is for the MESOS-1476 ticket).
>
>     In the Master::statusUpdateAcknowledgement method, you can't send a status
>     update ACK to a disconnected slave. This should be possible, though, for a
>     'deactivated' slave (one for which no resource offers are being sent, but
>     which is still running tasks that were launched before the slave was
>     marked as deactivated -> TODO for the MESOS-1476 ticket).
>
> Adam B wrote:
>     Hmm. I opine that 'disconnected' is the proper term for a slave that has
>     lost its connection to the master, and which will have to reregister upon
>     reconnecting. Also, 'deactivated' is the proper term for a slave for which
>     the allocator will no longer send any resource offers, and any outstanding
>     offers will no longer be considered valid. I could even believe that
>     disconnecting a slave with Master::disconnect(Slave) would involve
>     deactivating it in the allocator with allocator->slaveDeactivated(). So
>     far, the two terms/states have been tightly coupled. Now that you propose
>     a separation of states (for MESOS-1476), you must be careful when
>     considering one of the states to also consider the other. I think it's
>     safe to say that a 'disconnected' slave should always also be
>     'deactivated', but that a slave can also be 'deactivated' manually,
>     without getting disconnected first. A connected-but-deactivated slave may
>     even be able to reactivate without needing to reauthenticate/reregister.
>     But what if a deactivated slave does try to reregister? Does it restart
>     into the deactivated state? Can you call KillTask on a
>     connected-but-deactivated slave?
>
>     TLDR: You can keep 'bool disconnected' and even 'disconnect(Slave)' as is
>     for now, but be very careful when introducing a separate 'deactivated'
>     state for MESOS-1476.
>
> Alexandra Sava wrote:
>     If a connected-but-deactivated slave tries to re-register, it will still
>     be deactivated, because the state of the slave (activated/deactivated) is
>     kept in the registry. The master will consider a slave deactivated as long
>     as that slave remains in the registry. If, for some reason, the slave is
>     removed from the registry and later re-registers, the master will consider
>     it activated.
>
>     A connected-but-deactivated slave is just like a normal slave, except that
>     the master will no longer send resource offers that belong to it. If an
>     operator deactivates a slave, that slave will continue running the tasks
>     that were launched on it before deactivation. So yes, you can KillTask on
>     a connected-but-deactivated slave.
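
For reference, the rules discussed in the quoted thread above can be sketched
roughly as below. This is only an illustration, not the Mesos code: the
SlaveState struct, its 'connected'/'active' flags, and every function name
here are hypothetical, and the registry lookup is reduced to two booleans.

```cpp
// Hypothetical model of the two slave states discussed in this thread.
// None of these names come from src/master/master.hpp.
struct SlaveState {
  bool connected = true;  // false => slave lost its connection to the master
  bool active = true;     // false => no resource offers for this slave
};

// Disconnecting always deactivates as well ("a 'disconnected' slave should
// always also be 'deactivated'").
inline void disconnect(SlaveState& s) {
  s.connected = false;
  s.active = false;
}

// Manual (operator) deactivation leaves the connection untouched.
inline void deactivate(SlaveState& s) {
  s.active = false;
}

// On re-registration the slave is connected again. It stays deactivated only
// if it is still in the registry and recorded there as deactivated; a slave
// that was removed from the registry comes back activated.
inline void reregister(SlaveState& s, bool inRegistry,
                       bool deactivatedInRegistry) {
  s.connected = true;
  s.active = !(inRegistry && deactivatedInRegistry);
}

// Offers require a slave that is both connected and active.
inline bool canReceiveOffers(const SlaveState& s) {
  return s.connected && s.active;
}

// Status update ACKs and KillTask depend only on the connection, so they
// still work for a connected-but-deactivated slave.
inline bool canAcknowledgeStatusUpdate(const SlaveState& s) {
  return s.connected;
}

inline bool canKillTask(const SlaveState& s) {
  return s.connected;
}
```

Keeping the two flags separate, with disconnect() clearing both, is one way to
make sure that checking one state also takes the other into account.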
SGTM. Just want to make sure we're considering all these cases now that we're
splitting the 'deactivated' state into two states.


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23147/#review46964
-----------------------------------------------------------


On July 11, 2014, 6:28 a.m., Alexandra Sava wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23147/
> -----------------------------------------------------------
>
> (Updated July 11, 2014, 6:28 a.m.)
>
>
> Review request for mesos, Adam B and Ben Mahler.
>
>
> Bugs: MESOS-1188
>     https://issues.apache.org/jira/browse/MESOS-1188
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> The existing terminology is confusing both for "slaves.deactivated" and
> "frameworks.activated". Currently a deactivated slave actually represents a
> removed/shutdown slave, and the "frameworks.activated" map holds both
> activated and deactivated frameworks.
>
> To make things clearer, rename the following:
> * master.slaves.deactivated -> master.slaves.removed
> * master.slaves.activated -> master.slaves.registered
> * master.frameworks.activated -> master.frameworks.registered
> * allocator.slaveDisconnect -> allocator.slaveDeactivate
> * allocator.slaveReconnected -> allocator.slaveReactivated
>
>
> Diffs
> -----
>
>   src/master/allocator.hpp 1cd573477b609bb69264f16156a4004ecac672a7
>   src/master/constants.hpp 2daa9b004ab0cc79773730350369f66315356cad
>   src/master/constants.cpp e9e5e67f890f3399c24637c0f021d656dfe51118
>   src/master/hierarchical_allocator_process.hpp 1765e7035bdda4c28e79d74c92e77dcc99759001
>   src/master/http.cpp 4fba007bfb9909056dc85f9dc04483994d662740
>   src/master/master.hpp 8641f2dfe711481133869f876715b56728dc1bc0
>   src/master/master.cpp 86b147fce153fe3a241dbd841e033f2b7ca07b01
>   src/tests/fault_tolerance_tests.cpp ac65050bec5720b982f53d4dd6797cc3dee285dc
>   src/tests/master_authorization_tests.cpp 0fdf464cc4a562afec276ec604205af3b56636de
>   src/tests/mesos.hpp ae38a13d8b329f6e27813776e0d2f2b56605d0eb
>   src/tests/slave_recovery_tests.cpp 582f52d73eba0e3ab089ec573d9a6c43bff0339e
>
> Diff: https://reviews.apache.org/r/23147/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Alexandra Sava
>
>
