[ 
https://issues.apache.org/jira/browse/HBASE-30161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HBASE-30161:
-----------------------------------
    Labels: pull-request-available  (was: )

> Add paginated, single-RPC RegionLocator.getRegionLocations(startKey, limit) 
> API for bulk meta-cache warmup
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-30161
>                 URL: https://issues.apache.org/jira/browse/HBASE-30161
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Sanjeet Malhotra
>            Assignee: Sanjeet Malhotra
>            Priority: Major
>              Labels: pull-request-available
>
> `RegionLocator.getAllRegionLocations()` is currently the only bulk API
> to fetch all region locations of a table. Internally it opens a
> `ResultScanner` against `hbase:meta` via
> `MetaTableAccessor.scanMetaForTableRegions(...)` and drives
> `scanner.next()` in a loop, so the number of RPCs is
> `ceil(numRegions / hbase.meta.scanner.caching)`.
>
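To make the RPC-count formula above concrete, here is a minimal Java sketch of the arithmetic. The class and method names are hypothetical, and the caching value of 100 is an assumed figure for illustration only; the sketch counts just the `scanner.next()` round trips described in the issue, not any scanner open or close calls.

```java
public final class MetaScanRpcEstimate {

    /**
     * Estimates scanner.next() round trips as ceil(numRegions / caching),
     * following the formula quoted in the issue description.
     */
    static long rpcCount(long numRegions, int caching) {
        if (numRegions < 0 || caching <= 0) {
            throw new IllegalArgumentException("numRegions must be >= 0 and caching > 0");
        }
        // Integer ceiling division without going through floating point.
        return (numRegions + caching - 1) / caching;
    }

    public static void main(String[] args) {
        int caching = 100; // assumed value, not read from any real configuration
        for (long regions : new long[] {10, 10_000}) {
            System.out.println(regions + " regions -> " + rpcCount(regions, caching) + " RPCs");
        }
    }
}
```

Under that assumed caching value, a 10-region table needs a single round trip while a 10000-region table needs 100, which is the growth the issue is concerned about.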
> This is a problem for clients (e.g. Phoenix) that want to perform a
> *bulk warmup* of their region-location cache after a fresh JVM start
> while serializing meta access. The natural pattern is to wrap the call
> in a lock, mirroring what `ConnectionImplementation.locateRegionInMeta`
> already does for single-region lookups via `userRegionLock`. But
> because `getAllRegionLocations()` does N RPCs under one logical call:
>
> * The lock-timeout budget has to grow with table size; there is no
>   sensible fixed value that works for both 10-region and 10000-region
>   tables.
> * A single slow RPC inside the loop blocks all other meta lookups for
>   the duration.
> * Per-call duration is no longer constant w.r.t. data volume.
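The three points above suggest why a paginated, one-RPC-per-call API fits a lock-guarded warmup better. The Java sketch below shows one way a client might hold the lock for a single page at a time, so a fixed lock timeout applies regardless of table size. Everything in it is an assumption for illustration: `PageFetcher` is a hypothetical stand-in for a single-RPC page fetch and not the signature proposed in this issue, keys are plain strings rather than byte arrays, and the "meta table" is an in-memory sorted set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.stream.Collectors;

public final class PagedWarmupSketch {

    /** Hypothetical single-RPC page fetch; NOT an HBase interface. */
    interface PageFetcher {
        /** Returns up to {@code limit} keys >= startKey, in sorted order. */
        List<String> fetchPage(String startKey, int limit);
    }

    /** In-memory stand-in for the meta table, used only to exercise the loop. */
    static PageFetcher inMemoryFetcher(NavigableSet<String> keys) {
        return (startKey, limit) ->
            keys.tailSet(startKey, true).stream().limit(limit).collect(Collectors.toList());
    }

    /** Fetches all keys page by page, taking the lock once per page. */
    static List<String> warmup(PageFetcher fetcher, ReentrantLock metaLock,
                               int pageSize, long lockTimeoutMs) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be > 0");
        }
        List<String> all = new ArrayList<>();
        String startKey = "";
        while (true) {
            boolean locked;
            try {
                // The timeout covers one page only, so it need not scale with table size.
                locked = metaLock.tryLock(lockTimeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for lock", e);
            }
            if (!locked) {
                throw new IllegalStateException("timed out waiting for lock");
            }
            List<String> page;
            try {
                page = fetcher.fetchPage(startKey, pageSize);
            } finally {
                metaLock.unlock(); // released between pages so other lookups can interleave
            }
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // a short page means the end of the table was reached
            }
            // Smallest key strictly greater than the last one returned.
            startKey = page.get(page.size() - 1) + "\0";
        }
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>();
        for (int i = 0; i < 25; i++) {
            keys.add(String.format("region-%03d", i));
        }
        List<String> warmed = warmup(inMemoryFetcher(keys), new ReentrantLock(), 10, 1_000L);
        System.out.println("cached " + warmed.size() + " locations");
    }
}
```

Because the lock is released after every page, a slow fetch delays other meta lookups by at most one page's worth of work, and the per-call duration stays bounded by the page size rather than by the number of regions.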



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
