[ 
https://issues.apache.org/jira/browse/HADOOP-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559759#action_12559759
 ] 

Bryan Duxbury commented on HADOOP-2443:
---------------------------------------

I'm not sure I understand your logic. If I replaced line 231 with a continue, 
it would just continue the for loop that's iterating over the columns of the 
current result we pulled back. We break all the way out of the scanner 
iteration loop because this scanner can never yield another HRegionInfo that 
might match the current table. Continuing would only waste time traversing 
the rest of the scanner.

This did, however, bring to my attention that if there are multiple .META. 
regions, and the table we're searching for is in the first .META. region, this 
code will actually search at least the first row of the second region too. So 
in reality, the break SCANNER_LOOP should be break REGION_LOOP. Does that make 
sense?
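
To make the difference concrete, here is a minimal sketch of the two labeled 
breaks. It is not the code from the patch: the class, the scan method, and 
the use of plain string lists in place of .META. regions and scanner rows are 
all assumptions made for illustration. Only the label names SCANNER_LOOP and 
REGION_LOOP come from the discussion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LabeledBreakSketch {
    /**
     * Hypothetical stand-in for the .META. scan: each inner list is one
     * .META. region and each string is the table name found in one row.
     * Rows are assumed to be sorted by table name.
     * Returns the rows that were visited, in order.
     */
    static List<String> scan(List<List<String>> metaRegions, String table,
                             boolean breakRegionLoop) {
        List<String> visited = new ArrayList<>();
        REGION_LOOP:
        for (List<String> region : metaRegions) {
            SCANNER_LOOP:
            for (String rowTable : region) {
                visited.add(rowTable);
                if (rowTable.compareTo(table) > 0) {
                    // We are past every row that could belong to the table.
                    if (breakRegionLoop) {
                        break REGION_LOOP;  // stop searching entirely
                    }
                    break SCANNER_LOOP;     // only abandons this region's scanner
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<List<String>> meta = Arrays.asList(
            Arrays.asList("a", "a", "b"),
            Arrays.asList("c", "d"));
        // break SCANNER_LOOP still reads the first row of the second region.
        System.out.println(scan(meta, "a", false)); // prints [a, a, b, c]
        // break REGION_LOOP stops at the first non-matching row.
        System.out.println(scan(meta, "a", true));  // prints [a, a, b]
    }
}
```

With break SCANNER_LOOP the outer loop moves on to the next region and reads 
its first row before giving up again, which is the extra work described above.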

> [hbase] Keep lazy cache of regions in client rather than an 'authoritative' 
> list
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-2443
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2443
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: Bryan Duxbury
>             Fix For: 0.16.0
>
>         Attachments: 2443-v10.patch, 2443-v3.patch, 2443-v4.patch, 
> 2443-v5.patch, 2443-v6.patch, 2443-v7.patch, 2443-v8.patch, 2443-v9.patch
>
>
> Currently, when the client gets a NotServingRegionException -- usually 
> because the region is in the middle of being split, or because a 
> regionserver has crashed and the region is being moved elsewhere -- the 
> client does a complete refresh of its cache of region locations for a table.
> Chatting with Jim about a Paul Saab upload issue from Saturday night: when 
> tables are big and comprised of regions that are splitting fast (because of 
> bulk upload), it's unlikely a client will ever be able to obtain a stable 
> list of all region locations.  Given that any update or scan requires that 
> the list of all regions be in place before it proceeds, this can get in the 
> way of the client succeeding when the cluster is under load.
> Chatting, we figure that it's better for the client to hold a lazy region 
> cache: on NSRE, figure out only where that region has gone and update only 
> that entry in the client-side cache, rather than throw out all we know of a 
> table every time.
> Hopefully this will fix the issue PS was experiencing where during intense 
> upload, he was unable to get/scan/hql the same table.
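
As a rough illustration of the lazy cache idea in the description, here is a 
minimal sketch. It is not the HBase client code: the class name, the method 
names, and the simplification of keying regions by start key and resolving 
them through a caller-supplied lookup function are all assumptions.

```java
import java.util.TreeMap;
import java.util.function.Function;

public class LazyRegionCacheSketch {
    // Hypothetical cache: region start key -> server location.
    private final TreeMap<String, String> locations = new TreeMap<>();
    // Hypothetical stand-in for asking .META. where a single region lives.
    private final Function<String, String> metaLookup;
    // Counts how often we had to go back to the lookup function.
    int lookups = 0;

    LazyRegionCacheSketch(Function<String, String> metaLookup) {
        this.metaLookup = metaLookup;
    }

    /** Returns the cached location, looking it up only on a cache miss. */
    String locate(String startKey) {
        return locations.computeIfAbsent(startKey, k -> {
            lookups++;
            return metaLookup.apply(k);
        });
    }

    /** On NSRE, drop only the affected entry; everything else stays cached. */
    void onNotServingRegion(String startKey) {
        locations.remove(startKey);
    }

    public static void main(String[] args) {
        LazyRegionCacheSketch cache =
            new LazyRegionCacheSketch(k -> "server-for-" + k);
        cache.locate("a");
        cache.locate("m");
        cache.onNotServingRegion("a");  // only region "a" is invalidated
        cache.locate("m");              // served from the cache
        cache.locate("a");              // re-resolved lazily
        System.out.println(cache.lookups); // prints 3
    }
}
```

The point of the design is that one moved region costs one extra lookup, 
instead of a full refresh of every region location for the table.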

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
