Thanks for your answer. It clarified things for me. Thanks a lot.

2017-09-01 21:35 GMT+08:00 Ilya Pronin <ipro...@twopensource.com>:
> Hey,
>
> I'm not sure I understood your question correctly. But AFAIK the
> recovery_agent_removal_limit flag is intended to limit the number of agents
> that will be marked unreachable after the re-registration timeout. If the
> master sees that it has to remove more agents than the limit allows, it
> will fail over. Otherwise, agents that have not yet re-registered will be
> marked unreachable at slave_removal_rate_limit. Here's the code that does
> that:
> https://github.com/apache/mesos/blob/master/src/master/master.cpp#L1946
>
> We no longer shut down agents if they try to re-register after being marked
> unreachable, so we can safely remove those agents from the registry.
> However, it still might be a good signal for the operator to investigate
> why a lot of agents did not re-register.
>
> On Fri, Sep 1, 2017 at 6:46 AM, tommy xiao <xia...@gmail.com> wrote:
>
> > Today I was curious to read the Mesos source code for
> > --recovery_agent_removal_limit. How does it work in the source code? I
> > have not found any useful logic for recovery_agent_removal_limit. Can
> > anyone do me a favor?
> >
> > --
> > Deshi Xiao
> > Twitter: xds2000
> > E-mail: xiaods(AT)gmail.com
> >
>
> --
> Ilya Pronin
>

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
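
For readers following the thread, the decision described in the quoted reply can be sketched roughly as below. This is only an illustrative Python model, not the Mesos C++ code: the function name, its parameters, and the returned strings are hypothetical, and it assumes the limit is expressed as a fraction of the agents recovered from the registry. The rate limiting applied while marking agents unreachable is left out.

```python
def decide_after_reregistration_timeout(recovered_agents, not_reregistered, removal_limit):
    """Hypothetical model of the check made once the re-registration timeout fires.

    recovered_agents: number of agents the master recovered from the registry.
    not_reregistered: how many of those have not re-registered yet.
    removal_limit:    assumed to be a fraction in [0.0, 1.0], e.g. 0.2 for 20%.
    """
    if not 0.0 <= removal_limit <= 1.0:
        raise ValueError("removal_limit must be between 0.0 and 1.0")
    if recovered_agents <= 0 or not_reregistered <= 0:
        # No agents are missing, so there is nothing to remove.
        return "nothing_to_do"

    fraction_to_remove = not_reregistered / recovered_agents
    if fraction_to_remove > removal_limit:
        # Too many agents are missing: the master gives up instead of
        # removing them, which leads to a failover.
        return "fail_over"

    # Within the limit: the missing agents get marked unreachable
    # (in the real master this happens at the configured removal rate).
    return "mark_unreachable"


if __name__ == "__main__":
    print(decide_after_reregistration_timeout(100, 50, 0.2))
    print(decide_after_reregistration_timeout(100, 10, 0.2))
```

In this model, a limit of 1.0 means the master never fails over because of missing agents, while a lower value acts as a safety net that forces an operator to look at why so many agents did not come back.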