[ 
https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397333#comment-15397333
 ] 

Phil Yang commented on HBASE-9465:
----------------------------------

I was not aware of the ongoing work; may I know the issue number? If we have 
this feature, we need to do more work while waiting to push the first section of 
a region. We should also check whether the parent region has been fully pushed.

{quote}
Would the improvement be implemented in another JIRA ?
{quote}
I will do this after resolving this issue. It is not easy, because different 
tables make different progress in pushing. If we still use one thread to read 
the logs, we need a queue for blocked regions, and that queue cannot be allowed 
to grow without limit, or it will result in an OOM.
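
To make the constraint concrete, here is a minimal sketch of a bounded buffer for blocked regions. It is not taken from any patch on this issue; the class name `BlockedRegionBuffer`, its methods, and the `max_entries` cap are all hypothetical. The point it illustrates is that when the cap is reached, the single reader has to stop reading rather than keep buffering.

```python
from collections import deque


class BlockedRegionBuffer:
    """Hypothetical bounded buffer for entries of regions that cannot be pushed yet."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # hard cap across all blocked regions
        self.pending = {}               # region name -> deque of buffered entries
        self.size = 0                   # total number of buffered entries

    def offer(self, region, entry):
        """Buffer an entry; return False when full so the reader can pause."""
        if self.size >= self.max_entries:
            return False
        self.pending.setdefault(region, deque()).append(entry)
        self.size += 1
        return True

    def release(self, region):
        """Return the region's buffered entries in arrival order and free their space."""
        entries = list(self.pending.pop(region, ()))
        self.size -= len(entries)
        return entries
```

A `False` from `offer` is the backpressure signal: the reader thread would stop consuming the log until some region is released, which bounds memory at the cost of stalling unrelated tables that share the same reader.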

> Push entries to peer clusters serially
> --------------------------------------
>
>                 Key: HBASE-9465
>                 URL: https://issues.apache.org/jira/browse/HBASE-9465
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Replication
>            Reporter: Honghua Feng
>            Assignee: Phil Yang
>         Attachments: HBASE-9465-v1.patch, HBASE-9465-v2.patch, 
> HBASE-9465-v2.patch, HBASE-9465.pdf
>
>
> When a region move or RS failure occurs in the master cluster, the hlog 
> entries that were not pushed before the region move or RS failure will be 
> pushed by the original RS (for a region move) or by another RS that takes 
> over the remaining hlog of the dead RS (for an RS failure), and the new 
> entries for the same region(s) will be pushed by the RS that now serves the 
> region(s). These RSs push the hlog entries of the same region concurrently, 
> without coordination.
> This can lead to data inconsistency between the master and peer clusters:
> 1. A put and then a delete are written to the master cluster.
> 2. Due to a region move or RS failure, they are pushed by different 
> replication-source threads to the peer cluster.
> 3. If the delete is pushed to the peer cluster before the put, and a flush 
> and major compaction occur in the peer cluster before the put is pushed, 
> the delete is collected and the put remains in the peer cluster.
> In this scenario, the put remains in the peer cluster, while in the master 
> cluster the put is masked by the delete, hence the data inconsistency 
> between the master and peer clusters.
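
The scenario in the description above can be reproduced with a toy model. This is only a sketch under simplifying assumptions: cells are `(kind, row, timestamp)` tuples, a delete masks every put on the same row with an equal or older timestamp, and major compaction drops delete markers together with the puts they mask. The function names are made up for illustration and do not correspond to HBase APIs.

```python
def major_compact(cells):
    """Drop delete markers and every put they mask; keep only surviving puts."""
    newest_delete = {}
    for kind, row, ts in cells:
        if kind == "delete":
            newest_delete[row] = max(ts, newest_delete.get(row, ts))
    return [(kind, row, ts) for kind, row, ts in cells
            if kind == "put" and ts > newest_delete.get(row, -1)]


def visible(cells, row):
    """Return the timestamp of the newest unmasked put for the row, or None."""
    deletes = [ts for kind, r, ts in cells if kind == "delete" and r == row]
    masked_up_to = max(deletes, default=-1)
    puts = [ts for kind, r, ts in cells
            if kind == "put" and r == row and ts > masked_up_to]
    return max(puts, default=None)


def replay(ops, compact_after=None):
    """Apply ops in arrival order, optionally compacting after index compact_after."""
    cells = []
    for i, op in enumerate(ops):
        cells.append(op)
        if compact_after == i:
            cells = major_compact(cells)
    return cells


PUT = ("put", "row1", 1)
DELETE = ("delete", "row1", 2)

master = replay([PUT, DELETE])                  # written in order on the master
peer = replay([DELETE, PUT], compact_after=0)   # delete arrives first, then compaction
```

In this model `visible(master, "row1")` is `None` because the delete masks the put, while `visible(peer, "row1")` is `1`: the marker was collected before the put arrived, so nothing masks it any more. Without the intervening compaction the peer would still hold the marker and the two clusters would agree, which is why the reordering only becomes visible after a major compaction.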



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
