[ 
https://issues.apache.org/jira/browse/HBASE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850588#action_12850588
 ] 

Andrew Purtell commented on HBASE-2357:
---------------------------------------

Writes would be blocked by the slowest member of the clique, but if this scheme 
allows (strongly consistent!) read load to be spread out more, then in theory 
anyway the probability of hot accesses to a particular region server starving 
the write side is lowered accordingly. We could mock it and see what happens 
and/or try to work through some of the particulars formally. Like Ryan I wonder 
how slow updates might get. Consider if we run ZAB on a 3-node clique and 
hflush in parallel to commit with a barrier on completion of both. Who wins the 
race? How often would hflush take longer? Could be a substantial percentage, 
especially in a mixed HBase and HDFS (plain mapreduce or Hive or Pig or 
Cascading or...) loaded environment. It's not clear that hflush would not 
dominate, is my point.
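
A rough way to mock the race described above is a small Monte Carlo sketch. 
Everything numeric here is an assumption made for illustration: the lognormal 
latency shape, the median values, and the names simulate_commit_race, 
ack_median_ms and hflush_median_ms are all hypothetical, not measurements or 
HBase/ZooKeeper APIs. The model waits for both follower acks to mirror the 
"slowest of the clique" case; a real ZAB quorum only needs a majority, so this 
is pessimistic on the consensus side.

```python
import math
import random


def simulate_commit_race(trials=10_000, seed=0,
                         ack_median_ms=2.0, hflush_median_ms=2.0, sigma=0.75):
    """Mock a commit that waits on a barrier over a consensus round and an hflush.

    Returns (fraction of commits where hflush finished last, mean commit latency in ms).
    All latency parameters are made-up inputs, not measured values.
    """
    rng = random.Random(seed)
    hflush_last = 0
    total_commit_ms = 0.0
    for _ in range(trials):
        # Primary plus two followers: wait for both follower acks (slowest wins).
        acks = [rng.lognormvariate(math.log(ack_median_ms), sigma) for _ in range(2)]
        consensus_ms = max(acks)
        hflush_ms = rng.lognormvariate(math.log(hflush_median_ms), sigma)
        if hflush_ms > consensus_ms:
            hflush_last += 1
        # The barrier releases only when both operations have completed.
        total_commit_ms += max(consensus_ms, hflush_ms)
    return hflush_last / trials, total_commit_ms / trials


if __name__ == "__main__":
    for median in (2.0, 4.0, 8.0):
        frac, mean_ms = simulate_commit_race(hflush_median_ms=median)
        print(f"hflush median {median} ms: hflush last in {frac:.1%} of commits, "
              f"mean commit {mean_ms:.2f} ms")
```

Raising hflush_median_ms stands in for an HDFS cluster that is also busy with 
mapreduce or similar load; feeding in measured latencies instead of the 
assumed distributions would be the obvious next step.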

What I don't like about log shipping is that the read replicas are not going to 
be useful to someone who uses HBase for its strong consistency and needs it, 
except in use cases where one could accept consistent results looking 
back from the timestamp of the last replication. (But that timestamp could be 
different on each slave, so master and slaves might all have different views!) 
But with a consensus protocol, read load can be spread, as is the intent of this 
issue, and yet the data is still strongly consistent. 
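
To make the "different views" concern concrete, here is a toy sketch of log 
shipping in which each slave has applied the primary's log only up to its own 
sequence id. The Replica class, the log layout and the row names are all 
hypothetical and chosen for illustration; this is not how HBase or HLog is 
implemented.

```python
class Replica:
    """Toy replica that applies a shared, ordered log up to some sequence id."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.applied_seq = 0  # last log sequence id applied locally

    def apply(self, log, up_to_seq):
        # Apply only entries not yet seen, and no further than up_to_seq.
        for seq, key, value in log:
            if self.applied_seq < seq <= up_to_seq:
                self.store[key] = value
                self.applied_seq = seq

    def read(self, key):
        return self.store.get(key)


# Ordered log entries: (sequence id, row key, value).
log = [(1, "row1", "a"), (2, "row1", "b"), (3, "row1", "c")]

primary = Replica("primary")
slave1 = Replica("slave1")
slave2 = Replica("slave2")

primary.apply(log, up_to_seq=3)  # the primary has every write
slave1.apply(log, up_to_seq=2)   # slave1 lags by one entry
slave2.apply(log, up_to_seq=1)   # slave2 lags by two entries

# Each replica is internally consistent as of its own applied_seq,
# but the same read returns a different answer depending on who serves it.
views = {r.name: (r.applied_seq, r.read("row1")) for r in (primary, slave1, slave2)}

if __name__ == "__main__":
    for name, (seq, value) in views.items():
        print(f"{name}: applied_seq={seq} row1={value}")
```

A client that can name the sequence id or timestamp it is willing to read at 
could still use such replicas; a client that needs the latest write could not.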

So I might humbly suggest that both ideas have pros and cons and neither 
warrants a -1 nor a +1 at this point, IMO. 

> Add read-only region replicas (slaves) for availability and fast region 
> recovery
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-2357
>                 URL: https://issues.apache.org/jira/browse/HBASE-2357
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: master, regionserver
>            Reporter: Todd Lipcon
>
> I don't plan on working on this in the short term, but the idea is to extend 
> region ownership to have two modes. Each region has one primary region server 
> and N slave region servers. The slaves would follow the master (probably by 
> streaming the relevant HLog entries directly from it) and be able to serve 
> stale reads. The benefit is twofold: (a) provides the ability to spread read 
> load, (b) enables very fast region failover/rebalance since the memstore is 
> already nearly up to date on the slave RS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
