-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23868/#review48685
-----------------------------------------------------------


Chatted with BenM.

We've been having this problem: doReliableRegistration calls itself and thus 
forms a loop, which stops when a "registered" message is received. A loop 
starts whenever a new master is detected, so if a new master is detected while 
a loop is still running, another loop is created, and this can go on and on. 
This doesn't lead to incorrect slave state, but it generates more events in the 
slave process and consumes more CPU and memory. We could probably come up with 
some "loop" abstraction to handle these tasks safely. This is not a big concern 
for now, as the doReliableRegistration loop is not a tight one and the 
condition under which multiple loops are created is relatively rare.
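
To make the "loop" idea concrete, here is a minimal sketch of one possible 
guard. It is not the Mesos code: the Slave struct, the event queue, the 
generation counter and the attempts counter are all hypothetical stand-ins, 
and the delayed self-dispatch is modeled as pushing a closure onto a queue. 
Each detection bumps a generation number, and a loop iteration that belongs to 
an older generation simply stops instead of rescheduling itself.

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded event queue standing in for the slave process.
struct Slave {
  std::deque<std::function<void()>> events;
  bool registered = false;
  std::uint64_t generation = 0;  // Bumped on every master detection.
  int attempts = 0;              // Counts registration messages "sent".

  // Called whenever a new master is detected.
  void detected() {
    registered = false;
    ++generation;  // Invalidates any loop started for an earlier detection.
    doReliableRegistration(generation);
  }

  void doReliableRegistration(std::uint64_t loopGeneration) {
    if (registered || loopGeneration != generation) {
      return;  // Already registered, or superseded by a newer loop.
    }

    ++attempts;  // Stand-in for sending a (re-)register message.

    // Stand-in for a delayed self-dispatch: the loop re-enqueues itself.
    events.push_back([this, loopGeneration]() {
      doReliableRegistration(loopGeneration);
    });
  }

  // Runs up to `steps` queued events in FIFO order.
  void run(int steps) {
    for (int i = 0; i < steps && !events.empty(); ++i) {
      std::function<void()> event = std::move(events.front());
      events.pop_front();
      event();
    }
  }
};
```

With this guard, three back-to-back detections leave only one live loop once 
the queued iterations have run; without the generation check, all three would 
keep rescheduling themselves until a "registered" message arrives.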


src/master/constants.hpp
<https://reviews.apache.org/r/23868/#comment85433>

    What is this for?



src/slave/slave.cpp
<https://reviews.apache.org/r/23868/#comment85431>

    Should we check "pingTimer.timeout().expired()"?
    
    Suppose the slave receives a ping before the timer times out, but its 
    queue is backed up and thus the timer isn't cancelled in time. The timer 
    then times out and dispatches a redetect() that is executed after ping(). 
    In that case we don't really need to redetect, right?
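
    A rough sketch of the check I have in mind follows. It does not use the 
    real libprocess Timer/Timeout types: PingTimer, pingTimedOut and the 
    explicit "now" arguments are assumptions made so the ordering can be 
    shown deterministically. The point is only that the timeout handler 
    looks at the current timer, which ping() may have re-armed, before it 
    redetects.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for a ping timer: it records only its deadline.
struct PingTimer {
  Clock::time_point deadline;

  bool expired(Clock::time_point now) const { return now >= deadline; }
};

struct Slave {
  std::chrono::milliseconds pingTimeout{100};
  PingTimer pingTimer{};
  int redetections = 0;

  // Handling a ping re-arms the timer with a fresh deadline.
  void ping(Clock::time_point now) {
    pingTimer = PingTimer{now + pingTimeout};
  }

  // Handler dispatched when a ping timer fires. Because the slave's queue
  // may be backed up, it can run after a later ping() has been handled.
  void pingTimedOut(Clock::time_point now) {
    if (!pingTimer.expired(now)) {
      return;  // A ping was handled in the meantime; this timeout is stale.
    }

    ++redetections;  // Stand-in for redetect().
  }
};
```

    If ping() runs at t0 + 101ms and the stale handler runs at t0 + 102ms, 
    the current deadline is t0 + 201ms, so the handler returns without 
    redetecting. A handler that runs after that deadline still redetects.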


- Jiang Yan Xu


On July 23, 2014, 7:55 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23868/
> -----------------------------------------------------------
> 
> (Updated July 23, 2014, 7:55 p.m.)
> 
> 
> Review request for mesos, Vinod Kone and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-1529
>     https://issues.apache.org/jira/browse/MESOS-1529
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This is the first step in MESOS-1529.
> 
> If we get into a situation where the slave thinks it is registered, but the 
> master does not, then the slave should re-register. This situation can 
> often be detected on the slave side when the slave is no longer receiving 
> pings from the master.
> 
> 
> Diffs
> -----
> 
>   src/master/constants.hpp 8ace682bc58e4fae65038906a4abec5879f35020 
>   src/slave/constants.hpp 97dc1b30fa81000ea60223c4059a0a64d27e91c4 
>   src/slave/constants.cpp a75b1ef8eddeb55350810b36ac35136d2e5d6f9d 
>   src/slave/slave.hpp a896bb66db5d8cd27ef02b6498c9db93cb0d525f 
>   src/slave/slave.cpp 1d5691836822c8587e1aa8ed24860a8012c67a6e 
>   src/tests/slave_tests.cpp e45255a6f699e51bf09397da95a5a11edbabe591 
> 
> Diff: https://reviews.apache.org/r/23868/diff/
> 
> 
> Testing
> -------
> 
> Added tests.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>
