-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51653/
-----------------------------------------------------------

(Updated Sept. 8, 2016, 3:49 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Simplify test.


Bugs: MESOS-5965
    https://issues.apache.org/jira/browse/MESOS-5965


Repository: mesos


Description
-------

Now that we wait for the agent to be removed from the registry before stopping the SlaveObserver, it is possible for an agent to fail health checks multiple times if the registry operation takes longer than `agent_ping_timeout`.

This commit updates the master logic to handle this by ignoring health check failures while the registry operation to mark the agent unreachable is still in progress.


Diffs (updated)
-----

  src/master/master.cpp 1dcce6cd66804990af238176c61aca03bb5c9471
  src/tests/partition_tests.cpp f3142ad8d50daafcdb70ad9dbb2772f8ba30db00

Diff: https://reviews.apache.org/r/51653/diff/


Testing
-------

make check on OSX and Linux.

`./src/mesos-tests --gtest_filter="Strict/PartitionTest.FailHealthChecksTwice/0" --gtest_repeat=1000 --gtest_break_on_failure`


Thanks,

Neil Conway
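

Illustrative sketch
-------

The behavior summarized under Description can be sketched roughly as follows. This is a standalone illustration, not the code in src/master/master.cpp: the class `UnreachableTracker`, its member names, and the use of plain strings for agent IDs are all hypothetical and were chosen only to show the idea of ignoring repeated health check failures while a registry operation is still in flight.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the master's bookkeeping (an assumption
// for illustration; the real Mesos master is structured differently).
class UnreachableTracker {
public:
  // Called each time the observer reports a failed health check.
  // Returns true if a new registry operation should be started, or
  // false if one is already in flight and this failure is ignored.
  bool onHealthCheckFailure(const std::string& agentId) {
    // insert() reports whether the ID was newly added to the set.
    return markingUnreachable.insert(agentId).second;
  }

  // Called when the registry operation for this agent finishes.
  void onRegistryOperationDone(const std::string& agentId) {
    const auto erased = markingUnreachable.erase(agentId);
    assert(erased == 1 && "no registry operation was in flight");
    (void)erased;  // Avoid an unused-variable warning under NDEBUG.
  }

  // True while a registry operation for this agent is in flight.
  bool inProgress(const std::string& agentId) const {
    return markingUnreachable.count(agentId) > 0;
  }

private:
  // Agents whose "mark unreachable" registry operation has started
  // but not yet completed.
  std::unordered_set<std::string> markingUnreachable;
};
```

In this sketch, a second failure reported for the same agent before `onRegistryOperationDone` runs is dropped, which mirrors the case where the registry operation takes longer than `agent_ping_timeout`.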