[ https://issues.apache.org/jira/browse/HBASE-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630866#comment-13630866 ]
binlijin commented on HBASE-8338:
---------------------------------

I think HDFS quorum reads are important for reads; Facebook has already implemented this.

bq. http://research.google.com/people/jeff/latency.html

{code}
http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java?view=log

[89-fb] [HBASE-7509] Regionserver support to control quorum reads

Author: aaiyer

Summary:
It will be good to have the ability to control the parameters for quorum reads while the regionserver is still running.

We want to control:
1) the timeout that we wait for, before initiating the second read.
2) number of threads allocated for quorum reads - setting this to 0 will disable quorum reads

Depends on the quorum diff in HDFS to add the DFSClient calls.

https://phabricator.fb.com/D615354
{code}

> Latency Resilience; umbrella list of issues that will help us ride over bad
> disk, bad region, ec2, etc.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8338
>                 URL: https://issues.apache.org/jira/browse/HBASE-8338
>             Project: HBase
>          Issue Type: Umbrella
>          Components: LatencyResilience
>            Reporter: stack
>            Priority: Critical
>
> Chatting w/ Elliott, we started listing out items to fix that would help keep
> hbase latency approximately constant as disks went bad, were saturated by a
> neighbour (ec2), etc.
> I just made a new LatencyResilience issue category to tag issues that
> contribute to this project.
> I have to go at the moment but when I get back I'll start to link in existing
> issues that help this project along and I'll file new ones.
> Here is what we chatted about:
> + Multiple WALs effort will help keep write latency roughly constant.
> + Figuring out how to get a new read started over dfsclient if current replica
> read is taking too long would help keep reads about constant (maybe could
> exploit the nkeywal hackery messing w/ replicas order).
> + There is an issue where clients can currently pile up on a single region
> because of the way we do client queues by regionserver. This needs fixing.
> The above are a few ideas worth further exploration at least.
> Idea is to try and bring down our 95th percentiles and to make us more robust in
> the face of dying disks, etc. I see this issue rising to the fore now there
> has been good progress on the MTTR project.
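
For readers unfamiliar with the quorum-read idea in the comment above, here is a minimal sketch of the two controls it lists: a timeout before the second read is started, and a thread count where 0 disables the feature. This is an illustration only, not the 0.89-fb patch or the DFSClient API; the class HedgedReader, its read and shutdown methods, and the threads and hedgeTimeoutMillis parameters are all hypothetical names, and error handling is deliberately left out.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration; not the HBase 0.89-fb or HDFS implementation. */
final class HedgedReader<T> {
    private final ExecutorService pool;     // null when threads == 0 (feature disabled)
    private final long hedgeTimeoutMillis;  // how long to wait before starting the second read

    HedgedReader(int threads, long hedgeTimeoutMillis) {
        // Daemon threads so a slow, abandoned read never keeps the JVM alive.
        this.pool = threads > 0
            ? Executors.newFixedThreadPool(threads, r -> {
                  Thread t = new Thread(r);
                  t.setDaemon(true);
                  return t;
              })
            : null;
        this.hedgeTimeoutMillis = hedgeTimeoutMillis;
    }

    /** Returns the result of whichever read finishes first. */
    T read(Callable<T> primary, Callable<T> secondary) throws Exception {
        if (pool == null) {
            return primary.call();  // disabled: plain single read on the caller's thread
        }
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        Future<T> first = cs.submit(primary);
        // Give the primary replica a head start.
        Future<T> done = cs.poll(hedgeTimeoutMillis, TimeUnit.MILLISECONDS);
        if (done != null) {
            return done.get();
        }
        // Primary is slow: start the second read and take the first to complete.
        // Simplification: if the first completion is a failure, it is rethrown
        // rather than waiting for the other read.
        Future<T> second = cs.submit(secondary);
        try {
            return cs.take().get();
        } finally {
            first.cancel(true);
            second.cancel(true);
        }
    }

    void shutdown() {
        if (pool != null) {
            pool.shutdownNow();
        }
    }
}
```

With threads set to 0 the reader falls back to a single read, which matches the "setting this to 0 will disable quorum reads" behaviour described in the commit summary. A real implementation would also need to make both values adjustable while the regionserver is running, which this sketch does not attempt.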