Neil Conway created MESOS-4048:
----------------------------------

Summary: Consider unifying slave timeout behavior between steady state and master failover
Key: MESOS-4048
URL: https://issues.apache.org/jira/browse/MESOS-4048
Project: Mesos
Issue Type: Improvement
Components: master, slave
Reporter: Neil Conway
Priority: Minor

Currently, there are two timeouts that control what happens when an agent is partitioned from the master:

1. {{max_slave_ping_timeouts}} and {{slave_ping_timeout}} together control how long the master waits before declaring a slave to be dead in the "steady state".
2. {{slave_reregister_timeout}} controls how long the master waits for a slave to reregister after a master failover.

It is unclear whether these two cases really merit being treated differently. It might be simpler for operators to configure a single timeout that controls how long the master waits before declaring that a slave is dead.
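
To make the gap between the two cases concrete, here is a rough Python sketch of the effective wait in each one. It assumes the steady-state wait is approximately the product of the two ping flags, which the description above does not state explicitly. The helper names are hypothetical, and the flag values (15 seconds, 5 pings, 10 minutes) are illustrative assumptions, not settings taken from this issue.

```python
from datetime import timedelta


def steady_state_wait(slave_ping_timeout, max_slave_ping_timeouts):
    """Approximate wait before a slave is declared dead in the steady state.

    Assumption: the master gives up after `max_slave_ping_timeouts`
    consecutive pings, each of which times out after `slave_ping_timeout`.
    """
    return slave_ping_timeout * max_slave_ping_timeouts


def failover_wait(slave_reregister_timeout):
    """Wait for a slave to reregister after a master failover."""
    return slave_reregister_timeout


# Illustrative values only (assumed, not taken from the issue).
steady = steady_state_wait(timedelta(seconds=15), 5)
failover = failover_wait(timedelta(minutes=10))

print("steady state:", steady)
print("after failover:", failover)
print("same effective timeout:", steady == failover)
```

With values like these the two waits differ, which is the asymmetry a single unified timeout would remove.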