[ 
https://issues.apache.org/jira/browse/HBASE-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259797#comment-15259797
 ] 

Mikhail Antonov commented on HBASE-15648:
-----------------------------------------

[~stack] Yeah, the current implementation doesn't work.

So the idea here is that right now we have a client-side thread pool of 128
threads by default dedicated to meta lookups. In
ConnectionManager#locateRegionInMeta we don't have any locking to prevent
multiple threads from looking up the location of the same region, nor is there
any rate limiting on the number of lookups a client may issue to meta. (This is
all based on the idea that on a large cluster with spiky load, lookups to the
meta table could easily become a bottleneck, so this needs to be tightened up.)

Unfortunately, it looks like we can't really lock by region, since regions can
split/merge. We may instead try to:

 - use a rate-limited thread pool for meta lookups on the client side (newer
versions of Guava have a nice implementation) to cap the total number of
lookups happening per second
 - reduce the number of threads on the client side dedicated to meta lookups
(not really a complete solution, but still... 128 just looks like too many)
 - use throttling for meta requests on the server side
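
To make the first option concrete, here is a minimal token-bucket sketch in
plain Java. It is only an illustration of capping lookups per second, not
anything that exists in HBase: the class name MetaLookupThrottle and its
parameters are made up for this example, and a real change would more likely
use Guava's RateLimiter (presumably the implementation meant above) than a
hand-rolled limiter.

```java
/**
 * Hypothetical helper (not part of HBase): a token bucket that caps how many
 * meta lookups a client may start per second.
 */
public class MetaLookupThrottle {
    private final double permitsPerSecond; // steady-state refill rate
    private final double maxBurst;         // bucket capacity
    private double available;              // permits currently in the bucket
    private long lastRefillNanos;          // timestamp of the last refill

    public MetaLookupThrottle(double permitsPerSecond, double maxBurst) {
        this(permitsPerSecond, maxBurst, System.nanoTime());
    }

    /** Variant with an explicit start time, convenient for deterministic tests. */
    public MetaLookupThrottle(double permitsPerSecond, double maxBurst, long nowNanos) {
        this.permitsPerSecond = permitsPerSecond;
        this.maxBurst = maxBurst;
        this.available = maxBurst;
        this.lastRefillNanos = nowNanos;
    }

    /** Returns true if a lookup may start now; never blocks. */
    public boolean tryAcquire() {
        return tryAcquire(System.nanoTime());
    }

    /** Same as tryAcquire(), but with the caller supplying the current time. */
    public synchronized boolean tryAcquire(long nowNanos) {
        refill(nowNanos);
        if (available >= 1.0) {
            available -= 1.0;
            return true;
        }
        return false;
    }

    // Adds permits for the time elapsed since the last refill, up to maxBurst.
    private void refill(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed > 0) {
            available = Math.min(maxBurst, available + elapsed * permitsPerSecond / 1e9);
            lastRefillNanos = nowNanos;
        }
    }
}
```

A caller that gets false back would have to decide whether to retry after a
short backoff or fail the lookup; that policy is deliberately left out of the
sketch.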

Let me bump it to 1.4. Not really critical, especially as HBASE-15658 is fixed.

> Reduce number of concurrent region location lookups when MetaCache entry is 
> cleared
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-15648
>                 URL: https://issues.apache.org/jira/browse/HBASE-15648
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>    Affects Versions: 1.3.0
>            Reporter: Mikhail Antonov
>            Assignee: Mikhail Antonov
>         Attachments: HBASE-15648-branch-1.3.v1.patch
>
>
> It seems that in HConnectionImplementation#locateRegionInMeta, if a region
> location is removed from the cache, with a large number of client threads we
> could have many of them getting a cache miss and doing a meta scan, which
> looks unnecessary. We could employ a mechanism similar to what we have in
> IdLock in HFileReader to fetch the block to cache, to ensure that if one
> thread is already looking up the location for region R1, other threads that
> need its location wait until the first thread finishes its work.
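
The description above amounts to coalescing concurrent lookups for the same
key. A rough sketch of that idea in plain Java follows. It does not use IdLock
and is not the attached patch; the class name SingleFlightLookup and the
loader function are hypothetical stand-ins for the meta scan. Note the caveat
from the comment above: regions can split or merge, so choosing a stable key
is the hard part and is not solved here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of HBase): if one thread is already loading
 * the value for a key, other threads asking for the same key wait for that
 * result instead of issuing their own lookup.
 */
public class SingleFlightLookup<K, V> {
    // Lookups currently in progress, keyed by whatever identifies the target.
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    // Stand-in for the expensive operation, e.g. a scan of the meta table.
    private final Function<K, V> loader;

    public SingleFlightLookup(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            // Another thread is already looking this key up: wait for its result.
            // join() rethrows a failed lookup wrapped in CompletionException.
            return existing.join();
        }
        try {
            V value = loader.apply(key); // only the first thread runs the lookup
            mine.complete(value);
            return value;
        } catch (RuntimeException | Error e) {
            mine.completeExceptionally(e); // wake the waiters with the failure
            throw e;
        } finally {
            inFlight.remove(key, mine);    // later calls start a fresh lookup
        }
    }
}
```

This only deduplicates lookups that overlap in time; it is not a cache, so a
call made after the first lookup completes runs the loader again.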



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
