Bryan Beaudreault created HBASE-26878:
-----------------------------------------

             Summary: TableInputFormatBase should cache RegionSizeCalculator
                 Key: HBASE-26878
                 URL: https://issues.apache.org/jira/browse/HBASE-26878
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault
            Assignee: Bryan Beaudreault


TableInputFormatBase's getSplits() method instantiates a new 
RegionSizeCalculator on every call. Instantiating a RegionSizeCalculator involves 
scanning meta for all region locations of the given table. This can be costly 
for large tables, and we don't know how often a subclass will call getSplits().

When initializeTable is called, we already cache the RegionLocator and Admin 
that are passed into the RegionSizeCalculator. We should similarly cache the 
RegionSizeCalculator itself at that same time to avoid unnecessary meta scans 
on repeated getSplits() calls.
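
A minimal sketch of the proposed caching, assuming simplified stand-in classes 
rather than the real HBase types. SizeCalculatorSketch and InputFormatSketch 
are hypothetical names used only for illustration; the counter stands in for 
the meta scan that constructing a RegionSizeCalculator performs. This is not 
the actual patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for RegionSizeCalculator: construction is the costly step.
final class SizeCalculatorSketch {
    // Counts simulated meta scans so the effect of caching can be observed.
    static final AtomicInteger META_SCANS = new AtomicInteger();

    private final String tableName;

    SizeCalculatorSketch(String tableName) {
        this.tableName = tableName;
        // Stands in for scanning meta for all region locations of the table.
        META_SCANS.incrementAndGet();
    }

    String getTableName() {
        return tableName;
    }
}

// Hypothetical stand-in for TableInputFormatBase.
class InputFormatSketch {
    private String tableName;
    // Cached alongside the other per-table state, as the issue proposes.
    private SizeCalculatorSketch sizeCalculator;

    // Called once per table; builds and caches the calculator here.
    void initializeTable(String tableName) {
        this.tableName = tableName;
        this.sizeCalculator = new SizeCalculatorSketch(tableName);
    }

    // May be called many times by subclasses; reuses the cached calculator.
    List<String> getSplits() {
        if (sizeCalculator == null) {
            throw new IllegalStateException("initializeTable must be called first");
        }
        List<String> splits = new ArrayList<>();
        // Placeholder split: a real implementation would emit one per region.
        splits.add(sizeCalculator.getTableName() + "-split-0");
        return splits;
    }

    // Drops the cached state so a later initializeTable starts fresh.
    void closeTable() {
        this.tableName = null;
        this.sizeCalculator = null;
    }
}
```

With this shape, calling initializeTable once and getSplits several times 
increments the simulated scan counter only once. One trade-off to consider is 
that cached region sizes may become stale if the table changes between calls.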



--
This message was sent by Atlassian Jira
(v8.20.1#820001)