[jira] [Commented] (HBASE-8338) Latency Resilience; umbrella list of issues that will help us ride over bad disk, bad region, ec2, etc.

binlijin (JIRA) Fri, 12 Apr 2013 19:14:18 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630866#comment-13630866
 ]


binlijin commented on HBASE-8338:
---------------------------------

I think HDFS quorum reads are important for read latency; Facebook has already implemented this.
bq. http://research.google.com/people/jeff/latency.html

{code}
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java?view=log
[89-fb] [HBASE-7509] Regionserver support to control quorum reads

Author: aaiyer

Summary:
It would be good to have the ability to control the
parameters for quorum reads while the regionserver is still running.

We want to control:
  1) the timeout that we wait for, before initiating the second read.
  2) number of threads allocated for quorum reads
     - setting this to 0 will disable quorum reads

 Depends on the quorum diff in HDFS to add the DFSClient calls.
https://phabricator.fb.com/D615354
{code}
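
To make the two knobs in that commit message concrete, here is a minimal sketch of a hedged ("quorum") read. It is not the 0.89-fb or DFSClient code; the names hedged_read, make_pool, hedge_timeout_s and the replica callables are hypothetical and only illustrate the idea: wait up to a timeout on the first replica, then start a second read and take whichever finishes first. A thread count of 0 turns the feature off.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def make_pool(num_threads):
    """Return a thread pool for hedged reads, or None when num_threads is 0.

    Hypothetical helper: mirrors "setting this to 0 will disable quorum reads".
    """
    if num_threads <= 0:
        return None
    return ThreadPoolExecutor(max_workers=num_threads)


def hedged_read(replicas, pool, hedge_timeout_s):
    """Read from the first replica; if it is slow, also try the second.

    replicas: list of zero-argument callables, one per replica (hypothetical).
    pool: pool from make_pool(), or None to disable hedging.
    hedge_timeout_s: seconds to wait before initiating the second read.
    """
    if pool is None or len(replicas) < 2:
        # Hedging disabled or nothing to hedge with: plain blocking read.
        return replicas[0]()

    first = pool.submit(replicas[0])
    done, _ = wait([first], timeout=hedge_timeout_s)
    if done:
        # The first replica answered within the timeout; no second read needed.
        return first.result()

    # The first replica is slow: start a second read and return whichever
    # completes first. An exception from that read propagates to the caller.
    second = pool.submit(replicas[1])
    done, _ = wait([first, second], return_when=FIRST_COMPLETED)
    return next(iter(done)).result()
```

In this sketch the slower read is simply left to finish in the background. A real implementation would also need to cancel or account for it and to size the pool so that hedged reads cannot starve regular ones.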
                
> Latency Resilience; umbrella list of issues that will help us ride over bad 
> disk, bad region, ec2, etc.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8338
>                 URL: https://issues.apache.org/jira/browse/HBASE-8338
>             Project: HBase
>          Issue Type: Umbrella
>          Components: LatencyResilience
>            Reporter: stack
>            Priority: Critical
>
> Chatting w/ Elliott, we started listing out items to fix that would help keep 
> hbase latency approximately constant as disks went bad, were saturated by a 
> neighbour (ec2), etc.
> I just made a new LatencyResilience issue category to tag issues that 
> contribute to this project.
> I have to go at the moment but when I get back I'll start to link in existing 
> issues that help this project along and I'll file new ones.
> Here is what we chatted about:
> + Multiple WALs effort will help keep write latency roughly constant.
> + Figuring how to get a new read started over dfsclient if current replica 
> read is taking too long would help keep reads about constant (maybe could 
> exploit the nkeywal hackery messing w/ replicas order).
> + There is an issue where clients can currently pile up on a single region 
> because of the way we do client queues by regionserver.  This needs fixing.
> The above are a few ideas worth further exploration at least.
> Idea is to try and bring down our 95th percentiles and to make us more robust in 
> the face of dying disks, etc.  I see this issue rising to the fore now there 
> has been good progress on the MTTR project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
