> On June 30, 2014, 6:44 a.m., Adam B wrote:
> > src/master/master.hpp, lines 852-854
> > <https://reviews.apache.org/r/23147/diff/1/?file=620067#file620067line852>
> >
> >     // We mark a slave 'inactive' ...
> >     bool active;
>
> Alexandra Sava wrote:
>     I have some concerns about changing this. As the comment says, a slave is
>     marked as 'disconnected' when it is checkpointing. I would like to use the
>     'deactive' key for slaves for which no resource offers are being sent (this
>     is for the MESOS-1476 ticket).
>
>     In the Master::statusUpdateAcknowledgement method, you can't send a status
>     update ACK to a disconnected slave. This should be possible, though, for a
>     'deactivated' slave (one for which no resource offers are being sent, but
>     which is still running tasks that were launched before the slave was
>     marked as deactivated -> TODO for the MESOS-1476 ticket).
>
> Adam B wrote:
>     Hmm. I opine that 'disconnected' is the proper term for a slave that has
>     lost its connection to the master, and which will have to reregister upon
>     reconnecting. Also, 'deactivated' is the proper term for a slave for which
>     the allocator will no longer send any resource offers, and any outstanding
>     offers will no longer be considered valid. I could even believe that
>     disconnecting a slave with Master::disconnect(Slave) would involve
>     deactivating it in the allocator with allocator->slaveDeactivated(). So far,
>     the two terms/states have been tightly coupled. Now that you propose a
>     separation of states (for MESOS-1476), you must be careful when considering
>     one of the states to also consider the other. I think it's safe to say that a
>     'disconnected' slave should always also be 'deactivated', but that a slave
>     can also be 'deactivated' manually, without getting disconnected first. A
>     connected-but-deactivated slave may even be able to reactivate without
>     needing to reauthenticate/reregister. But what if a deactivated slave does
>     try to reregister? Does it restart into deactivated state? Can you call
>     KillTask on a connected-but-deactivated slave?
>
>     TLDR: You can keep 'bool disconnected' and even 'disconnect(Slave)' as is
>     for now, but be very careful when introducing a separate 'deactivated' state
>     for MESOS-1476.
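
The relationship between the two states that Adam describes above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `SlaveState` and all of its member names are hypothetical and are not the Mesos `Slave` struct or any code from this review. It assumes that disconnecting always deactivates, that an operator can deactivate without disconnecting, and that a status update ACK needs only a connection. What happens on reregistration is left out because it is still an open question at this point in the thread.

```cpp
#include <cassert>

// Hypothetical model of the two states discussed in the review.
// This is NOT the Mesos Slave struct; all names are illustrative.
struct SlaveState {
  bool connected = true;  // Slave has a live connection to the master.
  bool active = true;     // Allocator may send offers for this slave.

  // Losing the connection also stops offers, so a disconnected slave
  // is always deactivated as well.
  void disconnect() {
    connected = false;
    active = false;
  }

  // Manual deactivation: offers stop, but the connection is untouched.
  void deactivate() { active = false; }

  // Reactivation without reregistering is assumed to be allowed only
  // while the slave is still connected. Returns false otherwise.
  bool reactivate() {
    if (!connected) {
      return false;
    }
    active = true;
    return true;
  }

  // Offers require both a connection and an active slave.
  bool canReceiveOffers() const { return connected && active; }

  // A status update ACK requires only a connection, so it still
  // reaches a connected-but-deactivated slave.
  bool canReceiveStatusUpdateAck() const { return connected; }

  // Invariant from the discussion: disconnected implies deactivated.
  bool invariantHolds() const { return connected || !active; }
};
```

With this model, calling `deactivate()` on a fresh `SlaveState` leaves `canReceiveStatusUpdateAck()` true while `canReceiveOffers()` becomes false. After `disconnect()` both are false and `reactivate()` is refused.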

If a connected-but-deactivated slave tries to re-register, it will still be
deactivated. That is, the state of the slave (activated/deactivated) is kept in
the registry. The master considers a slave deactivated for as long as that
slave is in the registry. If for some reason the slave is removed from the
registry and later re-registers, the master will consider it activated.

A connected-but-deactivated slave is just like a normal slave, except that the
master will no longer send resource offers that belong to it. If an operator
deactivates a slave, that slave continues running the tasks that were launched
on it before deactivation. So yes, you can KillTask on a
connected-but-deactivated slave.

- Alexandra

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23147/#review46964
-----------------------------------------------------------

On July 11, 2014, 1:28 p.m., Alexandra Sava wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23147/
> -----------------------------------------------------------
>
> (Updated July 11, 2014, 1:28 p.m.)
>
>
> Review request for mesos, Adam B and Ben Mahler.
>
>
> Bugs: MESOS-1188
>     https://issues.apache.org/jira/browse/MESOS-1188
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> The existing terminology is confusing both for "slaves.deactivated" and
> "frameworks.activated". Currently a deactivated slave actually represents a
> removed/shutdown slave, and the "frameworks.activated" map holds both
> activated and deactivated frameworks.
>
> To make the terminology clearer, rename the following:
>   * master.slaves.deactivated -> master.slaves.removed
>   * master.slaves.activated -> master.slaves.registered
>   * master.frameworks.activated -> master.frameworks.registered
>   * allocator.slaveDisconnect -> allocator.slaveDeactivate
>   * allocator.slaveReconnected -> allocator.slaveReactivated
>
>
> Diffs
> -----
>
>   src/master/allocator.hpp 1cd573477b609bb69264f16156a4004ecac672a7
>   src/master/constants.hpp 2daa9b004ab0cc79773730350369f66315356cad
>   src/master/constants.cpp e9e5e67f890f3399c24637c0f021d656dfe51118
>   src/master/hierarchical_allocator_process.hpp 1765e7035bdda4c28e79d74c92e77dcc99759001
>   src/master/http.cpp 4fba007bfb9909056dc85f9dc04483994d662740
>   src/master/master.hpp 8641f2dfe711481133869f876715b56728dc1bc0
>   src/master/master.cpp 86b147fce153fe3a241dbd841e033f2b7ca07b01
>   src/tests/fault_tolerance_tests.cpp ac65050bec5720b982f53d4dd6797cc3dee285dc
>   src/tests/master_authorization_tests.cpp 0fdf464cc4a562afec276ec604205af3b56636de
>   src/tests/mesos.hpp ae38a13d8b329f6e27813776e0d2f2b56605d0eb
>   src/tests/slave_recovery_tests.cpp 582f52d73eba0e3ab089ec573d9a6c43bff0339e
>
> Diff: https://reviews.apache.org/r/23147/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Alexandra Sava
>
>
