[ https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021370#comment-15021370 ]
Jerry He commented on HBASE-13153:
----------------------------------

A general comment. Disclaimer: I have not read through the code closely, but I have read the doc and roughly followed the comments.

On the source cluster:
The source replication handler --> sends WAL entries to the peer, including bulkload entries, synchronously blocking for the response.

On the peer cluster:
The peer region server RPC handler --> sees a bulkload WAL entry --> invokes a bulkload client RPC to another region server --> synchronously blocking
Another region server RPC handler --> holds the region write lock --> synchronously transfers the files to be bulk loaded into the region from the remote cluster

Multiple handlers on the peer cluster can potentially be blocked. Multiple regions can be blocked from reading as well.
In the normal replication case, the granularity is a few WAL entries. With bulk load, the granularity of failure is the entire file.
This is probably going to be ok with low network latency. But what happens when the network latency is less than ideal? In an active-active case?
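
To make the peer-side concern concrete, here is a minimal sketch of the locking pattern described above. It is not taken from the patch or from HBase's region code: the class name, the method name, and the use of a plain java.util.concurrent ReentrantReadWriteLock as a stand-in for the region lock are all assumptions for illustration, and the cross-cluster file copy is simulated with a sleep. It only shows that a reader waiting on the same lock cannot proceed until the transfer finishes, so a slow network translates directly into blocked readers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not HBase code.
final class BulkLoadLockSketch {

    /**
     * Simulates a handler that holds a region-level write lock for transferMs
     * (standing in for a synchronous remote file copy) and reports whether a
     * reader could take the read lock within readerWaitMs.
     */
    public static boolean readerAcquiresLock(long transferMs, long readerWaitMs)
            throws InterruptedException {
        ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
        CountDownLatch writeLockHeld = new CountDownLatch(1);

        Thread bulkLoadHandler = new Thread(() -> {
            regionLock.writeLock().lock();
            try {
                writeLockHeld.countDown();
                Thread.sleep(transferMs); // simulated cross-cluster file transfer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                regionLock.writeLock().unlock();
            }
        });
        bulkLoadHandler.start();
        writeLockHeld.await(); // make sure the "bulk load" holds the lock first

        // The reader gives up if the write lock is not released in time.
        boolean acquired = regionLock.readLock().tryLock(readerWaitMs, TimeUnit.MILLISECONDS);
        if (acquired) {
            regionLock.readLock().unlock();
        }
        bulkLoadHandler.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("short transfer, reader proceeds: " + readerAcquiresLock(20, 1000));
        System.out.println("long transfer, reader proceeds: " + readerAcquiresLock(1000, 50));
    }
}
```

In this sketch the only variable is how long the lock holder takes, which is the point of the question about less-than-ideal network latency: the longer the transfer, the longer every reader of that region waits.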

> Bulk Loaded HFile Replication
> -----------------------------
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
> Issue Type: New Feature
> Components: Replication
> Reporter: sunhaitao
> Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch,
> HBASE-13153-v11.patch, HBASE-13153-v12.patch, HBASE-13153-v13.patch,
> HBASE-13153-v14.patch, HBASE-13153-v15.patch, HBASE-13153-v16.patch,
> HBASE-13153-v17.patch, HBASE-13153-v18.patch, HBASE-13153-v2.patch,
> HBASE-13153-v3.patch, HBASE-13153-v4.patch, HBASE-13153-v5.patch,
> HBASE-13153-v6.patch, HBASE-13153-v7.patch, HBASE-13153-v8.patch,
> HBASE-13153-v9.patch, HBASE-13153.patch,
> HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load Replication-v2.pdf,
> HBase Bulk Load Replication-v3.pdf, HBase Bulk Load Replication.pdf,
> HDFS_HA_Solution.PNG
>
>
> Currently we plan to use the HBase Replication feature to deal with a
> disaster tolerance scenario. But we have encountered an issue: we use
> bulkload very frequently, and because bulkload bypasses the write path and
> does not generate WAL entries, the data will not be replicated to the backup
> cluster. It is inappropriate to bulkload twice, once on the active cluster
> and once on the backup cluster. So I suggest modifying the bulkload feature
> so that a bulkload is applied to both the active cluster and the backup
> cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)