[ https://issues.apache.org/jira/browse/HBASE-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharath Vissapragada updated HBASE-24360:
-----------------------------------------
    Fix Version/s: 1.7.0

> RollingBatchRestartRsAction loses track of dead servers
> -------------------------------------------------------
>
>                 Key: HBASE-24360
>                 URL: https://issues.apache.org/jira/browse/HBASE-24360
>             Project: HBase
>          Issue Type: Test
>          Components: integration tests
>    Affects Versions: 2.3.0
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0
>
>
> {{RollingBatchRestartRsAction}} doesn't handle failure cases when tracking
> its list of dead servers. The original author believed that a failure to
> restart would result in a retry. However, by removing the dead server from
> the failed list prematurely, that state is lost, and retry of that server
> never occurs. Because this action doesn't ever look back to the current state
> of the cluster, relying only on its local state for the current action
> invocation, it never realizes the abandoned server is still dead. Instead, be
> more careful to only remove the dead server from the list when the
> {{startRs}} invocation claims to have been successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
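
For illustration only, the bookkeeping described in the issue could be sketched roughly as below. This is not the actual patch: the class name {{DeadServerTracking}}, the method {{restartPass}}, the use of a {{Deque<String>}} for the dead-server list, and the {{Predicate<String>}} standing in for {{startRs}} are all hypothetical. The point it tries to show is that a server leaves the local list only once the start attempt reports success, so a failed start stays queued for a later retry.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; names and types are assumptions, not HBase code.
class DeadServerTracking {

    /**
     * Attempts one restart for each server currently in deadServers.
     * A server is dropped from the queue only when startRs reports success;
     * on failure (false or an exception) it is put back for a later pass.
     * Returns the servers that were restarted in this pass.
     */
    static List<String> restartPass(Deque<String> deadServers, Predicate<String> startRs) {
        List<String> restarted = new ArrayList<>();
        int attempts = deadServers.size(); // visit each queued server once
        for (int i = 0; i < attempts; i++) {
            String server = deadServers.removeFirst();
            boolean started;
            try {
                started = startRs.test(server);
            } catch (RuntimeException e) {
                started = false; // treat an exception as a failed start
            }
            if (started) {
                restarted.add(server);
            } else {
                deadServers.addLast(server); // keep the state so a retry can happen
            }
        }
        return restarted;
    }

    public static void main(String[] args) {
        Deque<String> dead = new ArrayDeque<>(List.of("rs1", "rs2", "rs3"));
        // Stand-in for startRs: pretend that rs2 fails to start.
        List<String> restarted = restartPass(dead, s -> !s.equals("rs2"));
        System.out.println(restarted + " " + dead); // prints "[rs1, rs3] [rs2]"
    }
}
```

Calling {{restartPass}} again with the same queue would retry the server that failed, which is the behavior the issue says was being lost.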