> On Nov. 28, 2017, 7:01 p.m., Ilya Pronin wrote:
> > src/master/master.cpp
> > Lines 6789 (patched)
> > <https://reviews.apache.org/r/64098/diff/3/?file=1902267#file1902267line6789>
> >
> > I think this is not specific to unreachable agents. Can be an agent
> > that was recovered after failover.

Ilya, I agree that the reason needs to change depending on whether or not the agent was unreachable.

Also, Yan and I discussed further the agent re-registration scenarios in which the master should send status updates. If the master undergoes a failover, the current approach will make the master send status updates for all tasks on re-registering agents, which will slow down the critical path of agent re-registration. One good alternative is to send status updates only for unreachable agents: since the master has already sent TASK_LOST/TASK_UNREACHABLE for the tasks on these agents, there should definitely be a follow-up. Most frameworks today already reconcile frequently upon re-registering with the master, so explicit status updates for agents that re-register because of a failover seemed a bit unnecessary.

How do you feel about the changed approach?

- Megha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64098/#review191979
-----------------------------------------------------------


On Nov. 28, 2017, 12:55 a.m., Megha Sharma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64098/
> -----------------------------------------------------------
>
> (Updated Nov. 28, 2017, 12:55 a.m.)
>
>
> Review request for mesos, Ilya Pronin, James Peach, and Jiang Yan Xu.
>
>
> Bugs: MESOS-6406
>     https://issues.apache.org/jira/browse/MESOS-6406
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Master will send task status updates to frameworks when an agent
> re-registers.
>
>
> Diffs
> -----
>
>   src/master/master.cpp 2ddd67ada3731803b00883b6a1f32b20c1bb238f
>   src/tests/master_allocator_tests.cpp 3400d70bb0ba564eac43c4639eee0efd4d8059e6
>   src/tests/master_tests.cpp 9c450b9f592d9e09a468f537d9b500e97acc636b
>   src/tests/partition_tests.cpp e49c474167076b4136a161ed29b11db9a13455a7
>   src/tests/persistent_volume_tests.cpp acfeac16884b00581a3523607ff26f44f6dca53a
>   src/tests/slave_recovery_tests.cpp c864aa92d9ff128a89dbc25653385de25653f56a
>   src/tests/upgrade_tests.cpp 7f434dbba858f636719eec24e92b306b76430c4c
>
>
> Diff: https://reviews.apache.org/r/64098/diff/4/
>
>
> Testing
> -------
>
> with make check
>
>
> Thanks,
>
> Megha Sharma
>