[jira] [Commented] (HBASE-9360) Enable 0.94 -> 0.96 replication to minimize upgrade down time

Jeffrey Zhong (JIRA) Mon, 09 Sep 2013 16:59:04 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762484#comment-13762484
 ]


Jeffrey Zhong commented on HBASE-9360:
--------------------------------------

[~jdcryans] You mentioned another valid scenario for 0.94->0.96 replication 
support. Minimizing the upgrade time (or pain) is just one of the driving 
factors. Thanks.
                
> Enable 0.94 -> 0.96 replication to minimize upgrade down time
> -------------------------------------------------------------
>
>                 Key: HBASE-9360
>                 URL: https://issues.apache.org/jira/browse/HBASE-9360
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: migration
>    Affects Versions: 0.98.0, 0.96.0
>            Reporter: Jeffrey Zhong
>
> As we know, 0.96 is a singularity release. As of today, a 0.94 HBase user 
> has to do an in-place upgrade: make the corresponding client changes, 
> recompile the client application code, fully shut down the existing 0.94 
> HBase cluster, deploy the 0.96 binaries, run the upgrade script and then 
> start the upgraded cluster. You can imagine that the down time will be 
> extended if something goes wrong in between. 
> To minimize the down time, another possible way is to set up a secondary 
> 0.96 cluster and then set up replication between the existing 0.94 cluster 
> and the new 0.96 slave cluster. Once the 0.96 cluster is synced, a user can 
> switch the traffic to the 0.96 cluster and decommission the old one.
> The ideal steps would be:
> 1) Set up a 0.96 cluster
> 2) Set up replication from the running 0.94 cluster to the newly created 
> 0.96 cluster
> 3) Wait until the two clusters are in sync through replication
> 4) Start duplicated writes to both the 0.94 and 0.96 clusters (replication 
> could be stopped at this point)
> 5) Forward read traffic to the slave 0.96 cluster
> 6) After a certain period, if everything is good, stop writes to the 
> original 0.94 cluster and complete the upgrade
> To get us there, there are two tasks:
> 1) Enable replication from 0.94 -> 0.96
> I've run the idea by [~jdcryans], [~devaraj] and [~ndimiduk]. Currently it 
> seems the best approach is to build a very similar service to, or build on 
> top of, https://github.com/NGDATA/hbase-indexer/tree/master/hbase-sep with 
> support for three commands: replicateLogEntries, multi and delete. Inside 
> the three commands, we just pass the corresponding requests down to the 
> destination 0.96 cluster, acting as a bridge. The reason to support multi 
> and delete is so that CopyTable can copy data from a 0.94 cluster to a 0.96 
> one.
> The other approach is to provide limited support for the 0.94 RPC protocol 
> in 0.96. One issue with this is that a 0.94 client needs to talk to 
> ZooKeeper first before it can connect to a 0.96 region server. Therefore, we 
> would need a fake ZooKeeper setup in front of the 0.96 cluster for a 0.94 
> client to connect to. It may also pollute the 0.96 code base with 0.94 RPC 
> code.
> 2) To support writes to a 0.96 cluster and a 0.94 cluster at the same time, 
> we need to load both HBase clients into one single JVM using different 
> class loaders.
> Let me know if you think this is worth doing, or if there is a better 
> approach we could take.
> Thanks!
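
A minimal sketch of the class-loader isolation mentioned in task 2 above, 
assuming each client version and its dependencies live in their own jars. The 
jar paths and the DualClientLoaders class are hypothetical and only 
illustrate the idea; this is not code from the issue. Each loader's parent is 
the JDK's own loader rather than the application loader, so classes from one 
HBase version cannot leak into the other.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DualClientLoaders {

    /** Builds a loader that sees only the given jars plus the JDK's own classes. */
    public static URLClassLoader isolatedLoader(Path... jars) throws MalformedURLException {
        URL[] urls = new URL[jars.length];
        for (int i = 0; i < jars.length; i++) {
            urls[i] = jars[i].toUri().toURL();
        }
        // Use the parent of the application loader (extension/platform loader)
        // so nothing on the application classpath is shared between versions.
        return new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent());
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical jar locations; a real setup would also list each
        // version's dependencies (Hadoop, ZooKeeper, and so on) per loader.
        try (URLClassLoader v94 = isolatedLoader(Paths.get("lib/hbase-0.94/hbase.jar"));
             URLClassLoader v96 = isolatedLoader(Paths.get("lib/hbase-0.96/hbase-client.jar"))) {
            String className = "org.apache.hadoop.hbase.client.HTable";
            try {
                Class<?> old = Class.forName(className, false, v94);
                Class<?> cur = Class.forName(className, false, v96);
                // Same name, different defining loaders: two distinct classes.
                System.out.println("distinct classes: " + (old != cur));
            } catch (ClassNotFoundException e) {
                System.out.println("client jars not found at the assumed paths");
            }
        }
    }
}
```

Because the two HTable classes are unrelated types as far as the JVM is 
concerned, application code would have to reach them through reflection or 
through a small shared interface loaded by the application loader.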

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
