[ https://issues.apache.org/jira/browse/HBASE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-3596:
--------------------------------------

    Attachment: HBASE-3596.patch

Simple patch that adds a configurable time to sleep before trying to lock a region server.

> [replication] Wait a few seconds before transferring queues
> ------------------------------------------------------------
>
>                 Key: HBASE-3596
>                 URL: https://issues.apache.org/jira/browse/HBASE-3596
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3596.patch
>
>
> ReplicationSourceManager.transferQueues is running a little too fast at the
> moment, and this has the bad side effect of making us run into HBASE-2611 at
> almost every cluster restart. The reason is that some servers might shut down
> faster than others, so the last region servers to be notified will at the same
> time see their friends dying and will try to pick up their queues. What happens
> then is that they also get told to shut down and might be able to close their
> ZK session before the queue transfer process is completed, which is what
> HBASE-2611 is about.
> Currently the only way to fix that is to delete the lock znode by hand and
> bounce a region server so that it picks up the queue on startup.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
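
For readers who do not have the attachment at hand, the idea described above (sleep for a configurable time, then attempt the queue transfer) could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in HBASE-3596.patch: the class FailoverDelay, its method waitThenTransfer, and the stillRunning check after the sleep are all hypothetical names and assumptions, and no HBase or ZooKeeper API is used.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper illustrating "wait before transferring queues".
 * Not taken from HBASE-3596.patch; names and behavior are assumptions.
 */
final class FailoverDelay {
    // Time to wait before trying to claim a dead server's queues.
    // In a real server this would come from configuration.
    private final long sleepBeforeTransferMillis;

    FailoverDelay(long sleepBeforeTransferMillis) {
        if (sleepBeforeTransferMillis < 0) {
            throw new IllegalArgumentException("sleep time must be >= 0");
        }
        this.sleepBeforeTransferMillis = sleepBeforeTransferMillis;
    }

    /**
     * Sleeps for the configured time, then runs the transfer only if this
     * server is still running (assumption: the delay gives the server time
     * to learn that it is being shut down as well).
     *
     * @return true if the transfer was attempted, false otherwise
     */
    boolean waitThenTransfer(BooleanSupplier stillRunning, Runnable transferQueues) {
        try {
            Thread.sleep(sleepBeforeTransferMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and skip the transfer.
            Thread.currentThread().interrupt();
            return false;
        }
        if (!stillRunning.getAsBoolean()) {
            // We were told to stop while waiting; do not take the lock.
            return false;
        }
        transferQueues.run();
        return true;
    }
}
```

With a delay of a few seconds, a server that is itself part of the cluster restart would most likely see its own stop request before it tries to lock its dead peer, so it would not leave a half-finished transfer behind. Whether the real patch re-checks the stop flag after sleeping is not stated in this message.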