[ https://issues.apache.org/jira/browse/HBASE-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022326#comment-13022326 ]
Subbu M Iyer commented on HBASE-3580:
-------------------------------------

Thanks, JD, for taking a closer look. I reviewed your patch and it looks good to me. +1 for a commit.

> Remove RS from DeadServer when new instance checks in
> -----------------------------------------------------
>
>                 Key: HBASE-3580
>                 URL: https://issues.apache.org/jira/browse/HBASE-3580
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.90.3
>
>         Attachments: HBASE-3580-Remove-RS-from-DeadServer-when-new-instance-checks-in.patch,
>                      HBASE-3580-v3.patch,
>                      HBASE-3580_-_Remove_RS_from_dead_server_when_the_RS_when_new_instance_checks_in3.patch
>
>
> Keeping the servers in DeadServer until it reaches some maximum isn't super
> friendly; it confuses even the best of our users:
> {quote}
> 09:27 < gbowyer> Hi all, I have apparently three dead RS in my cluster, I
> cannot find references to them in HDFS or in ZK, how do I still report dead RS
> 09:27 < gbowyer> also the same nodes are reported as live region servers
> {quote}
> The subtle startcode difference can be hard to catch. This behavior also
> differs from 0.20 (so old users get confused, like I did when debugging this
> problem), and it differs from Hadoop's handling of dead DataNodes. It was
> introduced in HBASE-3282.
> I think this should be improved by doing what Hadoop does: removing the RS
> from DeadServers when a new instance with the same hostname+port checks in.
> Stack says we should do it in ServerManager.checkIsDead.
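
For readers following along, the behavior proposed in the description (drop a dead-server entry once a new instance with the same hostname+port checks in) can be sketched roughly as below. This is an illustration only, not the attached patch: the class `DeadServerSketch` and its methods are hypothetical, and it assumes server names take the form `hostname,port,startcode`, so that two instances on the same host and port differ only in the startcode.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical stand-in for the master's dead-server bookkeeping.
// It is NOT the real HBase DeadServer class or the HBASE-3580 patch.
class DeadServerSketch {
    // Assumed name format: "hostname,port,startcode".
    private final Set<String> deadServers = new HashSet<>();

    void markDead(String serverName) {
        deadServers.add(serverName);
    }

    boolean isDead(String serverName) {
        return deadServers.contains(serverName);
    }

    // Drops the trailing ",startcode" so instances can be compared by hostname+port.
    static String hostAndPort(String serverName) {
        int lastComma = serverName.lastIndexOf(',');
        return lastComma < 0 ? serverName : serverName.substring(0, lastComma);
    }

    // Called when a server checks in. Removes dead entries that share the
    // newcomer's hostname+port but carry a different startcode, and reports
    // whether anything was removed. An exact match (same startcode) is the
    // dead instance itself, so it stays in the set.
    boolean cleanPreviousInstance(String newServerName) {
        String key = hostAndPort(newServerName);
        boolean removed = false;
        for (Iterator<String> it = deadServers.iterator(); it.hasNext();) {
            String dead = it.next();
            if (!dead.equals(newServerName) && hostAndPort(dead).equals(key)) {
                it.remove();
                removed = true;
            }
        }
        return removed;
    }
}
```

With this sketch, marking `rs1.example.com,60020,1000` dead and then checking in `rs1.example.com,60020,2000` clears the old entry, so the same node is no longer listed as both live and dead. Where such a call would actually sit (for example, near the dead-server check mentioned in the description) is left to the patch itself.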