Guanghao Zhang created HBASE-17303:
--------------------------------------

             Summary: Let master to check and transfer the dead rs's replication queues
                 Key: HBASE-17303
                 URL: https://issues.apache.org/jira/browse/HBASE-17303
             Project: HBase
          Issue Type: Bug
          Components: Replication
            Reporter: Guanghao Zhang

Here is the replication queue dump from our cluster:

{code}
Found 8 deleted queues, run hbck -fixReplication in order to remove the deleted replication queues
hostname,24610,1481528189915/80-hostname,24620,1476784763605
hostname,24620,1476784763605/70-hostname,24630,1470418208092-hostname,24600,1476773709589
hostname,24630,1481528526258/17000-hostname,24620,1470044455538-hostname,24630,1470037674231-hostname,24600,1476773708489-hostname,24620,1476784763605
hostname,24620,1481528358531/70-hostname,24600,1476773709589-hostname,24620,1476784763605
hostname,24600,1481528021595/70-hostname,24630,1470421093464-hostname,24630,1476773708939-hostname,24610,1476779010928-hostname,24620,1476784747260
hostname,24600,1481528021595/17000-hostname,24620,1476784763605
hostname,24600,1481528021595/17000-hostname,24630,1475381530644-hostname,24600,1476773709589-hostname,24620,1476784763605
hostname,24600,1481528021595/17000-hostname,24600,1476773709589-hostname,24620,1476784763605
Found 2 dead regionservers, restart one regionserver to transfer the queues of dead regionservers
hostname,24600,1481547616148
hostname,24620,1476784763605
{code}

Currently, when a dead regionserver leaves a replication znode behind, you have to restart a regionserver so that it transfers the dead regionserver's replication queues. Following the same idea as HBASE-16336, we can let the master periodically check for dead regionserver znodes as well, and send a transfer-replication-queues request to any regionserver. The dead regionserver's replication queues would then be transferred automatically, without waiting for a regionserver restart.

Any suggestions are welcome.
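
To make the proposal concrete, here is a minimal sketch of one pass of such a periodic check. It is not HBase code: the class, the method names, and the `sendTransferRequest` callback are all hypothetical, and the sketch assumes the master can list the servers that own a replication queue znode and knows the set of live regionservers.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiConsumer;

/** Hypothetical sketch; none of these names are HBase APIs. */
class DeadQueueChecker {

    /** Servers that still own a replication queue znode but are no longer live. */
    static Set<String> findDeadServersWithQueues(Set<String> queueOwners,
                                                 Set<String> liveServers) {
        Set<String> dead = new TreeSet<>(queueOwners); // sorted for a stable order
        dead.removeAll(liveServers);
        return dead;
    }

    /**
     * One pass of the periodic check. For each dead server that still has
     * queues, asks a live server (chosen round-robin) to claim them.
     * Returns the number of transfer requests sent.
     */
    static int checkOnce(Set<String> queueOwners,
                         List<String> liveServers,
                         BiConsumer<String, String> sendTransferRequest) {
        if (liveServers.isEmpty()) {
            return 0; // nobody can take over the queues yet
        }
        int sent = 0;
        Set<String> live = new HashSet<>(liveServers);
        for (String dead : findDeadServersWithQueues(queueOwners, live)) {
            String target = liveServers.get(sent % liveServers.size());
            sendTransferRequest.accept(target, dead); // assumed RPC to the target
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        // Server names taken from the dump above; which ones are live is assumed.
        Set<String> owners = new HashSet<>(List.of(
                "hostname,24610,1481528189915",
                "hostname,24600,1481547616148",
                "hostname,24620,1476784763605"));
        List<String> live = List.of("hostname,24610,1481528189915");
        checkOnce(owners, live, (target, dead) ->
                System.out.println("ask " + target + " to claim queues of " + dead));
    }
}
```

In a real implementation this pass would presumably run from a master chore, as in HBASE-16336, and the callback would be an RPC to the chosen regionserver. Both of those are assumptions here.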