[
https://issues.apache.org/jira/browse/HBASE-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Enis Soztutar resolved HBASE-11580.
-----------------------------------
Resolution: Fixed
> Failover handling for secondary region replicas
> -----------------------------------------------
>
> Key: HBASE-11580
> URL: https://issues.apache.org/jira/browse/HBASE-11580
> Project: HBase
> Issue Type: Sub-task
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.1.0
>
> Attachments: hbase-11580-addendum.patch, hbase-11580_v2.patch,
> hbase-11580_v3.patch
>
>
> With the async wal approach (HBASE-11568), the edits are not persisted (to
> wal) in the secondary region replicas. However, this means that we have to
> deal with secondary region replica failures.
> We could re-replicate the edits from the primary to the secondary when the
> secondary region is opened on another server, but this would mean setting
> up a replication queue again and holding on to the wals for longer.
> Instead, we can design it so that the edits in the secondaries are not
> persisted to wal, and if a secondary replica fails over, it will not start
> serving reads until it can guarantee that it has all the past data.
> To guarantee that the secondary replica has all the edits before serving
> reads, we can use flush and region open markers. Whenever a region is
> opened, all the files present at the time of opening are written to wal
> (HBASE-11512). On a flush, the flushed file is written to wal as well, and
> the secondary replica can do an ls of the store files and pick up all the
> files before the seqId of the flushed file. So, in this design, the
> secondary replica waits until it sees and replays a flush or region open
> marker from the primary's wal, and only then starts serving reads. To speed
> up replica opening, we can, as an optimization, trigger a flush on the
> primary whenever the secondary replica opens.
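
The gating described above can be illustrated with a small sketch. This is a toy model in Python, not HBase code: the class names, the `files_on_disk` mapping (standing in for an ls of the store directory) and the `replay` method are all hypothetical. It also assumes that "files before the seqId of the flushed file" includes the flushed file itself, and that edits at or below a marker's seqId can be dropped from the replica's in-memory state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edit:
    seq_id: int
    key: str
    value: str


@dataclass(frozen=True)
class FlushMarker:
    seq_id: int        # highest seqId covered by the flushed file
    flushed_file: str


@dataclass(frozen=True)
class RegionOpenMarker:
    seq_id: int
    store_files: tuple  # store files of the primary at open time


@dataclass
class SecondaryReplica:
    # Hypothetical stand-in for "ls" on the store directory:
    # maps store file name -> max seqId contained in that file.
    files_on_disk: dict
    reads_enabled: bool = False
    store_files: set = field(default_factory=set)
    memstore: dict = field(default_factory=dict)

    def replay(self, entry):
        """Replay one entry received from the primary's wal."""
        if isinstance(entry, Edit):
            # Edits are kept in memory only; nothing is written to a wal.
            self.memstore[entry.key] = (entry.seq_id, entry.value)
        elif isinstance(entry, FlushMarker):
            # Pick up every listed file up to the flush seqId (assumed inclusive).
            self.store_files |= {
                name for name, seq in self.files_on_disk.items()
                if seq <= entry.seq_id
            }
            self._drop_covered_edits(entry.seq_id)
            self.reads_enabled = True
        elif isinstance(entry, RegionOpenMarker):
            # The marker itself lists the files, so no listing is needed.
            self.store_files = set(entry.store_files)
            self._drop_covered_edits(entry.seq_id)
            self.reads_enabled = True

    def _drop_covered_edits(self, seq_id):
        # Edits at or below seq_id are assumed to be in the store files now.
        self.memstore = {
            k: v for k, v in self.memstore.items() if v[0] > seq_id
        }

    def get(self, key):
        if not self.reads_enabled:
            raise RuntimeError("replica has not seen a flush or open marker yet")
        return self.memstore.get(key)
```

A freshly opened replica in this model rejects reads while it replays plain edits, and flips to serving only after the first flush or region open marker. The flush-on-open optimization would simply make that first marker arrive sooner.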