[ https://issues.apache.org/jira/browse/MESOS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383044#comment-14383044 ]
Benjamin Mahler commented on MESOS-2392:
----------------------------------------

Fixed an issue with these patches:

{noformat}
commit b9a8dcac78a66bb880955a47f164850eaf7aa4cc
Author: Benjamin Mahler <benjamin.mah...@gmail.com>
Date:   Thu Mar 26 16:39:32 2015 -0700

    Fixed an incorrect metric for slave removals.

    Review: https://reviews.apache.org/r/32555
{noformat}

> Rate limit slaves removals during master recovery.
> --------------------------------------------------
>
> Key: MESOS-2392
> URL: https://issues.apache.org/jira/browse/MESOS-2392
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Benjamin Mahler
> Assignee: Benjamin Mahler
> Labels: twitter
> Fix For: 0.23.0
>
>
> Much like we rate limit slave removals in the common path (MESOS-1148), we
> need to rate limit slave removals that occur during master recovery. When a
> master recovers and is using a strict registry, slaves that do not
> re-register within a timeout will be removed.
> Currently there is a safeguard in place to abort when too many slaves have
> not re-registered. However, in the case of a transient partition, we don't
> want to remove large sections of slaves without rate limiting.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
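
For illustration only, the behavior the issue description asks for might be sketched roughly as follows. This is a minimal Python sketch, not the Mesos implementation (the master is written in C++), and every name in it (remove_unregistered, TooManyUnregistered, max_removal_fraction, removals_per_second, is_reregistered) is hypothetical. It assumes the safeguard is expressed as a fraction of registered slaves and that removals are paced at a fixed rate, so that a slave coming back from a transient partition can be spared before its turn arrives.

```python
import time
from typing import Callable, Iterable, List


class TooManyUnregistered(Exception):
    """Hypothetical error: the safeguard threshold was exceeded."""


def remove_unregistered(
    registered: Iterable[str],
    reregistered: Iterable[str],
    max_removal_fraction: float,
    removals_per_second: float,
    remove: Callable[[str], None],
    is_reregistered: Callable[[str], bool] = lambda slave: False,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Remove slaves that missed the re-registration timeout, one at a time.

    All parameter names are assumptions made for this sketch.
    """
    registered = list(registered)
    seen = set(reregistered)
    missing = [slave for slave in registered if slave not in seen]

    # Safeguard: abort instead of removing a large share of the cluster.
    if registered and len(missing) / len(registered) > max_removal_fraction:
        raise TooManyUnregistered(
            f"{len(missing)} of {len(registered)} slaves did not re-register"
        )

    interval = 1.0 / removals_per_second
    removed = []
    for index, slave in enumerate(missing):
        if index > 0:
            # Pace removals so a transient partition has time to heal.
            sleep(interval)
        if is_reregistered(slave):
            # The slave came back while we were waiting; keep it.
            continue
        remove(slave)
        removed.append(slave)
    return removed
```

With the pacing in place, the safeguard still catches the extreme case, while a partition that heals partway through costs only the slaves removed so far rather than the whole affected section.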