> On Nov. 28, 2017, 7:01 p.m., Ilya Pronin wrote:
> > src/master/master.cpp
> > Lines 6789 (patched)
> > <https://reviews.apache.org/r/64098/diff/3/?file=1902267#file1902267line6789>
> >
> >     I think this is not specific to unreachable agents. Can be an agent 
> > that was recovered after failover.

Ilya, I agree the reason needs to be changed based on whether or not the agent 
was unreachable. Also, Yan and I discussed further the agent re-registration 
scenarios in which the master should send status updates. If the master 
undergoes a failover, the current approach makes it send status updates for 
all tasks on re-registering agents, which slows down the critical path of 
agent re-registration. One good alternative is to send status updates only 
for agents that were unreachable: since the master already sent 
TASK_LOST/TASK_UNREACHABLE for their tasks, there should definitely be a 
follow-up. Most frameworks today already do frequent reconciliations upon 
re-registering with the master, so explicit status updates for agents 
re-registering due to failover seem a bit unnecessary. How do you feel about 
the changed approach?
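
To make the proposal concrete, here is a rough sketch of the decision. All of 
the types and names below (ReregistrationKind, Task, StatusUpdate, 
updatesOnReregistration) are made up for illustration; they are not the 
structures used in master.cpp and this is not the code in the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real Mesos types.
enum class ReregistrationKind {
  AFTER_MASTER_FAILOVER,  // Agent recovered after a master failover.
  WAS_UNREACHABLE         // Agent had previously been marked unreachable.
};

struct Task {
  std::string id;
  std::string state;  // Current state reported by the agent.
};

struct StatusUpdate {
  std::string taskId;
  std::string state;
};

// Returns the status updates that would be sent to frameworks when an
// agent re-registers, under the proposed approach.
std::vector<StatusUpdate> updatesOnReregistration(
    ReregistrationKind kind,
    const std::vector<Task>& tasks)
{
  std::vector<StatusUpdate> updates;

  // Agents recovered after a failover are skipped: frameworks are
  // expected to reconcile, and this keeps re-registration fast.
  if (kind != ReregistrationKind::WAS_UNREACHABLE) {
    return updates;
  }

  // Frameworks were already told TASK_LOST/TASK_UNREACHABLE for these
  // tasks, so follow up with the state the agent now reports.
  for (const Task& task : tasks) {
    updates.push_back(StatusUpdate{task.id, task.state});
  }

  return updates;
}
```

With this shape, an agent re-registering after failover produces no updates, 
while a previously unreachable agent produces one update per task.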


- Megha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64098/#review191979
-----------------------------------------------------------


On Nov. 28, 2017, 12:55 a.m., Megha Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64098/
> -----------------------------------------------------------
> 
> (Updated Nov. 28, 2017, 12:55 a.m.)
> 
> 
> Review request for mesos, Ilya Pronin, James Peach, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6406
>     https://issues.apache.org/jira/browse/MESOS-6406
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Master will send task status updates to frameworks when an agent
> re-registers.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 2ddd67ada3731803b00883b6a1f32b20c1bb238f
>   src/tests/master_allocator_tests.cpp 3400d70bb0ba564eac43c4639eee0efd4d8059e6
>   src/tests/master_tests.cpp 9c450b9f592d9e09a468f537d9b500e97acc636b
>   src/tests/partition_tests.cpp e49c474167076b4136a161ed29b11db9a13455a7
>   src/tests/persistent_volume_tests.cpp acfeac16884b00581a3523607ff26f44f6dca53a
>   src/tests/slave_recovery_tests.cpp c864aa92d9ff128a89dbc25653385de25653f56a
>   src/tests/upgrade_tests.cpp 7f434dbba858f636719eec24e92b306b76430c4c
> 
> 
> Diff: https://reviews.apache.org/r/64098/diff/4/
> 
> 
> Testing
> -------
> 
> with make check
> 
> 
> Thanks,
> 
> Megha Sharma
> 
>
