[ https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847148#comment-13847148 ]

Hadoop QA commented on HBASE-9047:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12618538/HBASE-9047-0.94-v1.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified tests.

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8153//console

This message is automatically generated.

> Tool to handle finishing replication when the cluster is offline
> ----------------------------------------------------------------
>
>                 Key: HBASE-9047
>                 URL: https://issues.apache.org/jira/browse/HBASE-9047
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.96.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Demai Ni
>             Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
>         Attachments: HBASE-9047-0.94-v1.patch, HBASE-9047-0.94.9-v0.PATCH,
> HBASE-9047-trunk-v0.patch, HBASE-9047-trunk-v1.patch,
> HBASE-9047-trunk-v2.patch, HBASE-9047-trunk-v3.patch,
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v4.patch,
> HBASE-9047-trunk-v5.patch, HBASE-9047-trunk-v6.patch,
> HBASE-9047-trunk-v7.patch, HBASE-9047-trunk-v7.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a
> cluster that was shut down in an offline fashion. The motivation could be
> that you don't want to bring HBase back up but still need that data on the
> slave.
> So I have this idea of a tool that would be running on the master cluster
> while it is down, although it could also run at any time. Basically it would
> be able to read the replication state of each master region server, finish
> replicating what's missing to all the slaves, and then clear that state in
> ZooKeeper.
> The code that handles replication does most of that already, see
> ReplicationSourceManager and ReplicationSource. Basically when
> ReplicationSourceManager.init() is called, it will check all the queues in ZK
> and try to grab those that aren't attached to a region server. If the whole
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your
> machines and the load will be spread out, but that might not be a big concern
> if replication wasn't lagging since it would take a few seconds to finish
> replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake
> region server ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase
> and Facebook's is that the latter is always done separately from HBase itself.
> This jira isn't about doing that.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
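
The flow described in the issue (grab the queues that aren't attached to a live region server, finish replicating what is pending, then clear the state) can be illustrated with a small toy model. This is only a sketch under loud assumptions: a plain Python dict stands in for the queues kept in ZooKeeper, and every name below (claim_orphaned_queues, drain, fake-rs, the WAL names) is hypothetical. None of it is HBase API, and it does not reflect how ReplicationSourceManager is actually implemented.

```python
# Toy model only: a dict stands in for the replication queues kept in ZooKeeper.
# All names here are hypothetical and are not HBase APIs.

def claim_orphaned_queues(queues, live_servers, claimer_id):
    """Move every queue whose owner is not alive under claimer_id.

    queues maps an owning region server ID to the list of log names that
    still have to be replicated. Returns the list of owners that were claimed.
    """
    orphaned = [owner for owner in queues
                if owner not in live_servers and owner != claimer_id]
    for owner in orphaned:
        # Re-home the pending logs under the claimer, as a stand-in for
        # grabbing the queue in ZooKeeper.
        queues.setdefault(claimer_id, []).extend(queues.pop(owner))
    return orphaned


def drain(queues, claimer_id, ship):
    """Ship every pending log owned by claimer_id, then clear its state."""
    shipped = 0
    for log in queues.get(claimer_id, []):
        ship(log)  # hand the remaining edits to the slave cluster
        shipped += 1
    queues.pop(claimer_id, None)  # clear the replication state once done
    return shipped


if __name__ == "__main__":
    # Whole master cluster is down: no live servers, so every queue is claimed.
    zk = {"rs1": ["wal.1", "wal.2"], "rs2": ["wal.3"]}
    sent = []
    claim_orphaned_queues(zk, live_servers=set(), claimer_id="fake-rs")
    drain(zk, "fake-rs", sent.append)
    print(sent, zk)
```

The sketch uses a single claimer and no locking. Running the tool on several machines at once, as the issue suggests, would need the claim step to be atomic per queue so that two claimers never take the same one.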