[jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas

Enis Soztutar (JIRA) Mon, 02 Dec 2013 19:00:11 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837266#comment-13837266
 ]


Enis Soztutar commented on HBASE-10070:
---------------------------------------

bq. Enis Soztutar in your doc does "group-based assignment" mean assigning 
multiple regions on a single transaction?
I was referring to the absence of co-location constraints on secondary 
replicas whose primaries are hosted by the same RS. For example, if 
R1(replica=0) and R2(replica=0) are both hosted on RS1, then R1(replica=1) and 
R2(replica=1) can be hosted by RS2 and RS3 respectively. This can definitely 
use the HDFS block affinity work, though. 
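
The placement described above can be sketched as follows. This is only an 
illustration, not the balancer code from the patch: the function name 
place_secondaries and its dict-based inputs are hypothetical, and the only rule 
it enforces is that a secondary never lands on the RS hosting its own primary.

```python
import random


def place_secondaries(primaries, servers, seed=None):
    """Pick a server for replica=1 of every region (hypothetical helper).

    primaries: dict mapping region name -> RS hosting replica=0.
    servers:   list of all region server names.
    Secondaries of regions that share a primary RS are placed independently,
    so they are free to end up on different servers (no co-location).
    """
    rng = random.Random(seed)
    placement = {}
    for region, primary_rs in primaries.items():
        # Any server except the one already hosting this region's primary.
        candidates = [rs for rs in servers if rs != primary_rs]
        if not candidates:
            raise ValueError("need at least two region servers")
        placement[region] = rng.choice(candidates)
    return placement


if __name__ == "__main__":
    primaries = {"R1": "RS1", "R2": "RS1"}
    print(place_secondaries(primaries, ["RS1", "RS2", "RS3"]))
```

Because each region is placed independently, R1(replica=1) and R2(replica=1) 
may or may not share a server in any given run.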

bq. However, I don't think we get into a situation where all RS's must read all 
other RS's logs – we only need to have the shadows RS's to read the primary 
RS's log.
I am assuming a random distribution of secondary regions, per above. In this 
case, for replication=2, a region server will have half of its regions in 
primary mode and the other half in secondary mode. For every region in 
secondary mode, it has to tail the logs of the RS where the primary is hosted. 
However, since there is no co-location guarantee, the primaries are also 
randomly distributed. For n secondary regions and m region servers, with high 
probability you will have to tail the logs of most of the RSs if n > m (I do 
not have the smarts to calculate the exact probability). 
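
A rough way to put numbers on this, under a simplifying assumption that is not 
in the design doc: the primary of each of the n secondary regions sits, 
independently and uniformly at random, on one of the other m - 1 region 
servers. This is then the classic occupancy problem. The expected number of 
distinct logs to tail is (m - 1) * (1 - (1 - 1/(m - 1))^n), and the probability 
of having to tail every other server's log follows from inclusion-exclusion. 
The function names below are made up for illustration.

```python
from math import comb


def expected_logs_tailed(n, m):
    """Expected number of distinct RS logs one server must tail.

    n: secondary regions hosted on this server.
    m: total region servers (m >= 2); primaries live on the other m - 1.
    Assumes independent, uniform placement of primaries.
    """
    others = m - 1
    # Each other server is missed by all n primaries with prob (1 - 1/others)^n.
    return others * (1 - (1 - 1 / others) ** n)


def prob_tail_all(n, m):
    """Probability that all m - 1 other servers host at least one primary."""
    others = m - 1
    # Inclusion-exclusion over the set of servers that host no primary.
    return sum(
        (-1) ** k * comb(others, k) * ((others - k) / others) ** n
        for k in range(others + 1)
    )


if __name__ == "__main__":
    for n, m in [(10, 10), (50, 10), (100, 10)]:
        print(n, m, expected_logs_tailed(n, m), prob_tail_all(n, m))
```

For large m the expected fraction of other servers tailed is roughly 
1 - e^(-n/(m - 1)), so under this model it is already well over half at n = m 
and approaches 1 quickly as n grows past m.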

> HBase read high-availability using eventually consistent region replicas
> ------------------------------------------------------------------------
>
>                 Key: HBASE-10070
>                 URL: https://issues.apache.org/jira/browse/HBASE-10070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HighAvailabilityDesignforreadsApachedoc.pdf
>
>
> In the present HBase architecture, it is hard, probably impossible, to 
> satisfy constraints such as serving the 99th percentile of reads in under 10 
> ms. One of the major factors that affects this is the MTTR for regions. There 
> are three phases in the MTTR process: detection, assignment, and recovery. 
> Of these, detection is usually the longest and is presently on the order 
> of 20-30 seconds. During this time, clients are not able to read the 
> region data.
> However, some clients would be better served if regions remained available 
> for eventually consistent reads during recovery. This would help satisfy 
> low-latency guarantees for the class of applications that can work with 
> stale reads.
> To improve read availability, we propose a replicated read-only region 
> serving design, also referred to as secondary regions or region shadows. 
> Extending the current model, in which a region is opened for reads and writes 
> in a single region server, the region will also be opened for reading in 
> other region servers. The region server which hosts the region for reads and 
> writes (as in the current case) will be declared PRIMARY, while 0 or more 
> region servers might host the region as SECONDARY. There may be more than 
> one secondary (replica count > 2).
> Will attach a design doc shortly which contains most of the details and some 
> thoughts about development approaches. Reviews are more than welcome. 
> We also have a proof-of-concept patch, which includes the master and region 
> server side changes. Client-side changes will be coming soon as well. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)
