[ https://issues.apache.org/jira/browse/HBASE-9465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407234#comment-15407234 ]
Phil Yang commented on HBASE-9465:
----------------------------------
There are two new families, and the row key of both is encodedRegionName
(because when we read the WAL, we only know the encoded name).
rep_barrier saves every open sequence id of a region, encoded from Long to
Bytes in both the qualifier and the value. So the values may look like:
[1:1 100:100 135:135]
rep_position saves the position of pushed logs for each peer. So the values
look like: [1:135 2:100 4:10]
rep_position also has two special qualifiers, _TABLENAME_ and _DAUGHTER_, which
save the region's table name and daughter region names. They will be used by
the meta cleaner.
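To make the layout concrete, here is a minimal sketch of one row (keyed by
encodedRegionName) in each family. It is not taken from the patch: it uses
plain Java maps instead of the HBase client, the class and method names are
hypothetical, and ByteBuffer stands in for the Long-to-Bytes encoding. The
two special qualifiers are left out because their value encoding is not
described here.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical, HBase-free model of the two families for a single region row.
final class RepMetaLayoutSketch {

    // Assumed stand-in for the Long-to-Bytes encoding: 8 bytes, big-endian.
    static byte[] toBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    static long toLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    // rep_barrier: each open sequence id is stored as both qualifier and value.
    static Map<Long, byte[]> repBarrier(long... openSeqIds) {
        Map<Long, byte[]> family = new TreeMap<>();
        for (long id : openSeqIds) {
            family.put(id, toBytes(id));
        }
        return family;
    }

    // rep_position: qualifier is the peer id, value is the pushed position.
    static Map<String, byte[]> repPosition(Map<String, Long> pushedByPeer) {
        Map<String, byte[]> family = new TreeMap<>();
        for (Map.Entry<String, Long> e : pushedByPeer.entrySet()) {
            family.put(e.getKey(), toBytes(e.getValue()));
        }
        return family;
    }

    // Renders a family as "[qualifier:value ...]", decoding each value as a long.
    static <K> String render(Map<K, byte[]> family) {
        StringJoiner sj = new StringJoiner(" ", "[", "]");
        for (Map.Entry<K, byte[]> e : family.entrySet()) {
            sj.add(e.getKey() + ":" + toLong(e.getValue()));
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        Map<String, Long> pushed = new TreeMap<>();
        pushed.put("1", 135L);
        pushed.put("2", 100L);
        pushed.put("4", 10L);
        System.out.println("rep_barrier  " + render(repBarrier(1, 100, 135)));
        System.out.println("rep_position " + render(repPosition(pushed)));
    }
}
```

With the example values from this comment, the two rendered rows are
[1:1 100:100 135:135] and [1:135 2:100 4:10].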
> Push entries to peer clusters serially
> --------------------------------------
>
> Key: HBASE-9465
> URL: https://issues.apache.org/jira/browse/HBASE-9465
> Project: HBase
> Issue Type: New Feature
> Components: regionserver, Replication
> Reporter: Honghua Feng
> Assignee: Phil Yang
> Attachments: HBASE-9465-branch-1-v1.patch,
> HBASE-9465-branch-1-v1.patch, HBASE-9465-branch-1-v2.patch,
> HBASE-9465-v1.patch, HBASE-9465-v2.patch, HBASE-9465-v2.patch,
> HBASE-9465-v3.patch, HBASE-9465-v4.patch, HBASE-9465-v5.patch, HBASE-9465.pdf
>
>
> When a region move or RS failure occurs in the master cluster, the hlog
> entries that were not pushed before the region move or RS failure will be
> pushed by the original RS (for a region move) or by another RS that takes
> over the remaining hlog of the dead RS (for an RS failure). The new entries
> for the same region(s) will be pushed by the RS that now serves the
> region(s), but the two push the hlog entries of the same region
> concurrently, without coordination.
> This can lead to data inconsistency between the master and peer clusters:
> 1. A put and then a delete are written to the master cluster.
> 2. Due to a region move / RS failure, they are pushed by different
> replication-source threads to the peer cluster.
> 3. If the delete is pushed to the peer cluster before the put, and a flush
> and major compaction occur in the peer cluster before the put is pushed,
> the delete is collected and the put remains in the peer cluster.
> In this scenario, the put remains in the peer cluster, but in the master
> cluster the put is masked by the delete, hence the data inconsistency
> between the master and peer clusters.
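The scenario in the issue description can be replayed with a toy model of a
single cell. This is only a sketch under simplifying assumptions, not HBase
code: the Cluster class and its methods are hypothetical, a delete marker is
assumed to mask every put with a timestamp at or below its own, and a major
compaction is assumed to drop both the masked puts and the markers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of one cell's history on a cluster; not HBase code.
final class OutOfOrderReplicationSketch {

    static final class Cluster {
        private final List<Long> putTimestamps = new ArrayList<>();
        private final List<Long> deleteTimestamps = new ArrayList<>();

        void put(long ts) { putTimestamps.add(ts); }

        void delete(long ts) { deleteTimestamps.add(ts); }

        // A put is masked if some delete marker has an equal or newer timestamp.
        private boolean masked(long putTs) {
            for (long d : deleteTimestamps) {
                if (putTs <= d) {
                    return true;
                }
            }
            return false;
        }

        // Major compaction: drop masked puts, then drop the delete markers.
        void majorCompact() {
            putTimestamps.removeIf(ts -> masked(ts));
            deleteTimestamps.clear();
        }

        // True if at least one put is still readable.
        boolean visible() {
            for (long p : putTimestamps) {
                if (!masked(p)) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        // Master cluster: put (ts=1), then delete (ts=2).
        Cluster master = new Cluster();
        master.put(1);
        master.delete(2);

        // Peer cluster: the delete arrives first, a major compaction collects
        // it, and only then does the older put arrive.
        Cluster peer = new Cluster();
        peer.delete(2);
        peer.majorCompact();
        peer.put(1);

        System.out.println("master sees put: " + master.visible());
        System.out.println("peer sees put:   " + peer.visible());
    }
}
```

In this model the master hides the put while the peer still serves it, which
is the inconsistency that pushing each region's entries serially is meant to
prevent.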