[jira] [Commented] (HBASE-13153) enable bulkload to support replication

Ted Yu (JIRA) Thu, 27 Aug 2015 03:18:12 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716421#comment-14716421
 ]


Ted Yu commented on HBASE-13153:
--------------------------------

w.r.t. HFileReplicationEndPoint:
bq. After every configurable interval or max request size limit
Can you describe how the max request size limit would be monitored?
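For illustration only, one way such a limit could be monitored is to accumulate the encoded size of the queued HFile paths and ship the batch once either the byte threshold or the interval is exceeded. The class HFileRefBatch, its method names, and the choice to measure path bytes (rather than the size of the HFile data) are assumptions made for this sketch; none of them come from the design document or from HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical batcher, not HBase code: tracks pending request size and batch age. */
final class HFileRefBatch {
    private final long maxRequestBytes;    // assumed configurable size limit
    private final long maxIntervalMillis;  // assumed configurable interval
    private final List<String> paths = new ArrayList<>();
    private long pendingBytes = 0;
    private long batchStartMillis;

    HFileRefBatch(long maxRequestBytes, long maxIntervalMillis, long nowMillis) {
        this.maxRequestBytes = maxRequestBytes;
        this.maxIntervalMillis = maxIntervalMillis;
        this.batchStartMillis = nowMillis;
    }

    /** Queues a path and reports whether the batch should be shipped now. */
    boolean add(String hfilePath, long nowMillis) {
        paths.add(hfilePath);
        pendingBytes += hfilePath.getBytes(StandardCharsets.UTF_8).length;
        return shouldFlush(nowMillis);
    }

    /** True when the batch is non-empty and either limit has been reached. */
    boolean shouldFlush(long nowMillis) {
        if (paths.isEmpty()) {
            return false;
        }
        return pendingBytes >= maxRequestBytes
            || nowMillis - batchStartMillis >= maxIntervalMillis;
    }

    /** Returns the queued paths and starts a new, empty batch. */
    List<String> drain(long nowMillis) {
        List<String> out = new ArrayList<>(paths);
        paths.clear();
        pendingBytes = 0;
        batchStartMillis = nowMillis;
        return out;
    }
}
```

The caller passes the current time explicitly, so a periodic check of shouldFlush covers the interval case even when no new HFile arrives.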

bq. Peer cluster RS will receive the RPC request having multiple hfile paths
The HFile paths are already stored in ZK. Do we need to send them in the RPC as well?

bq. Peer RS will send the response with Success OR Failure paths
The response can be sent before HFile splitting is completed, right?

bq. Inside hfiles node, there will be children node for every bulk loaded hfile 
name and hfile path as its data.
Could there be a collision between HFile names?
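A minimal sketch of the concern, under stated assumptions: if the child node name is only the last path component, two bulk loaded files with the same name in different directories would map to the same node. The class HFileNodeNames, both method names, and the idea of encoding the full path as the node name are hypothetical and only illustrate the question; they are not taken from the design document.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not HBase code: two ways to derive a child node name. */
final class HFileNodeNames {
    /** Node name taken from the file name alone, as the quoted text suggests. */
    static String byFileName(String hfilePath) {
        return hfilePath.substring(hfilePath.lastIndexOf('/') + 1);
    }

    /**
     * Assumed alternative: encode the whole path so that distinct paths give
     * distinct names and the result contains no '/' characters.
     */
    static String byEncodedPath(String hfilePath) {
        return URLEncoder.encode(hfilePath, StandardCharsets.UTF_8);
    }
}
```

With two made-up paths such as /staging/r1/cf/hfile_0 and /staging/r2/cf/hfile_0, byFileName returns the same name for both, while byEncodedPath keeps them apart.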

bq. Once the limit is reached, the new entries will not be queued.
This constraint is due to the limit on the amount of data that can be stored in
ZK. Have you thought of introducing a system table for recording information
about the HFiles to be replicated?

bq. During Scan there will not be any matching entry corresponding to “1” in 
Peer cluster Visibility Tables. 
The index for a visibility table entry could be different in the peer cluster.
Should visibility labels be rewritten during replication?

bq. if again replicated from cluster-2 to active cluster, it will be accepted.
Could the sequence Id be used so that the HFiles don't need to be written again?


> enable bulkload to support replication
> --------------------------------------
>
>                 Key: HBASE-13153
>                 URL: https://issues.apache.org/jira/browse/HBASE-13153
>             Project: HBase
>          Issue Type: Bug
>          Components: API
>            Reporter: sunhaitao
>            Assignee: Ashish Singhi
>         Attachments: HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase Replication feature to deal with a
> disaster tolerance scenario. But we have encountered an issue: we use
> bulkload very frequently, and because bulkload bypasses the write path and
> does not generate WAL, the data will not be replicated to the backup cluster.
> It's inappropriate to bulkload twice, on both the active cluster and the
> backup cluster. So I suggest modifying the bulkload feature to enable
> bulkload to both the active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
