[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

Hadoop QA (JIRA) Thu, 12 Dec 2013 20:31:00 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847148#comment-13847148
 ]


Hadoop QA commented on HBASE-9047:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12618538/HBASE-9047-0.94-v1.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified tests.

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8153//console

This message is automatically generated.

> Tool to handle finishing replication when the cluster is offline
> ----------------------------------------------------------------
>
>                 Key: HBASE-9047
>                 URL: https://issues.apache.org/jira/browse/HBASE-9047
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.96.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Demai Ni
>             Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
>         Attachments: HBASE-9047-0.94-v1.patch, HBASE-9047-0.94.9-v0.PATCH, 
> HBASE-9047-trunk-v0.patch, HBASE-9047-trunk-v1.patch, 
> HBASE-9047-trunk-v2.patch, HBASE-9047-trunk-v3.patch, 
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v4.patch, 
> HBASE-9047-trunk-v5.patch, HBASE-9047-trunk-v6.patch, 
> HBASE-9047-trunk-v7.patch, HBASE-9047-trunk-v7.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a 
> cluster that was shut down in an offline fashion. The motivation could be 
> that you don't want to bring HBase back up but still need that data on the 
> slave.
> So I have this idea of a tool that would be running on the master cluster 
> while it is down, although it could also run at any time. Basically it would 
> be able to read the replication state of each master region server, finish 
> replicating what's missing to all the slave, and then clear that state in 
> zookeeper.
> The code that handles replication does most of that already, see 
> ReplicationSourceManager and ReplicationSource. Basically when 
> ReplicationSourceManager.init() is called, it will check all the queues in ZK 
> and try to grab those that aren't attached to a region server. If the whole 
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your 
> machines and the load will be spread out, but that might not be a big concern 
> if replication wasn't lagging since it would take a few seconds to finish 
> replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake 
> region server ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase 
> and Facebook's is that the latter is always done separately of HBase itself. 
> This jira isn't about doing that.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

Reply via email to