-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29507/
-----------------------------------------------------------

(Updated Feb. 19, 2015, 12:10 a.m.)

Review request for mesos, Ben Mahler and Niklas Nielsen.

Changes
-------

Merged in subsequent patch to make slave use master's ping timeout from SlaveRe[re]gisteredMessage, plus unit tests.

Bugs: MESOS-2110
    https://issues.apache.org/jira/browse/MESOS-2110

Repository: mesos

Description (updated)
-------

Added new --slave_ping_timeout and --max_slave_ping_timeouts flags to mesos-master to supplement the DEFAULT_SLAVE_PING_TIMEOUT (15secs) and DEFAULT_MAX_SLAVE_PING_TIMEOUTS (5).
These can be extended if slaves are expected/allowed to be down for longer than a minute or two.
Slave will receive master's ping timeout in SlaveRe[re]gisteredMessage.

Beware that this affects recovery from network timeouts as well as actual slave node/process failover.

Diffs (updated)
-----

  src/master/constants.hpp ad3fe81
  src/master/constants.cpp d3d0f71
  src/master/flags.hpp 51a6059
  src/master/master.cpp f10a3cf
  src/messages/messages.proto 58484ae
  src/slave/constants.hpp 12d6e92
  src/slave/constants.cpp 7868bef
  src/slave/slave.hpp 91dae10
  src/slave/slave.cpp aec9525
  src/tests/fault_tolerance_tests.cpp efa5c57
  src/tests/partition_tests.cpp eb16a58
  src/tests/slave_recovery_tests.cpp 8210c52
  src/tests/slave_tests.cpp 153d9d6

Diff: https://reviews.apache.org/r/29507/diff/

Testing (updated)
-------

Manually tested slave failover/shutdown with master using different --slave_ping_timeout and --max_slave_ping_timeouts.
Ran unit tests with shorter non-default values for ping timeouts.
`make check` with new unit tests: ShortPingTimeoutUnreachableMaster and ShortPingTimeoutUnreachableSlave

Thanks,

Adam B
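
P.S. For reviewers weighing flag values, the arithmetic behind "a minute or two" in the Description can be sketched as below. This is only an illustration, not code from the patch: the helper name `slaveUnreachableAfter` is hypothetical, and it assumes the master gives up on a slave once it has missed `--max_slave_ping_timeouts` consecutive pings of `--slave_ping_timeout` each.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of the patch): the total time a slave may
// go without answering pings before the master stops waiting for it.
// Assumption: the window is simply pingTimeout * maxPingTimeouts.
inline std::chrono::seconds slaveUnreachableAfter(
    std::chrono::seconds pingTimeout,
    std::size_t maxPingTimeouts)
{
  // Both values must be positive for the window to be meaningful.
  assert(pingTimeout.count() > 0);
  assert(maxPingTimeouts > 0);

  return pingTimeout *
         static_cast<std::chrono::seconds::rep>(maxPingTimeouts);
}
```

With the defaults named in the Description (15 secs and 5), this gives 75 seconds. Raising either flag widens the window for network timeouts and for real slave failover alike, which is the trade-off noted above.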