Bryan Beaudreault created HBASE-26878:
-----------------------------------------
Summary: TableInputFormatBase should cache RegionSizeCalculator
Key: HBASE-26878
URL: https://issues.apache.org/jira/browse/HBASE-26878
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
Assignee: Bryan Beaudreault
TableInputFormatBase's getSplits() method instantiates a new
RegionSizeCalculator on every call. Instantiating a RegionSizeCalculator involves
scanning meta for all region locations of the given table. This can be costly
for large tables, and we don't know how often a subclass will call getSplits().
When initializeTable() is called, we already cache the RegionLocator and Admin
that are passed into the RegionSizeCalculator. We should similarly
cache the RegionSizeCalculator itself at that same time to avoid unnecessary
meta scans on repeated getSplits() calls.
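
A minimal sketch of the proposed caching pattern, using hypothetical stand-in
classes rather than the real HBase types. SizeCalculatorStub and
CachingInputFormatSketch are invented for illustration, and the counter only
simulates the cost of a meta scan; this is not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for RegionSizeCalculator. Each construction bumps a
// counter that represents one scan of meta for the table's region locations.
class SizeCalculatorStub {
    static final AtomicInteger META_SCANS = new AtomicInteger();

    SizeCalculatorStub(String tableName) {
        META_SCANS.incrementAndGet(); // simulated meta scan
    }

    long getRegionSize(String regionName) {
        return 0L; // placeholder size; the real class reports bytes per region
    }
}

// Hypothetical stand-in for TableInputFormatBase showing the caching idea.
class CachingInputFormatSketch {
    private String tableName;
    private SizeCalculatorStub sizeCalculator; // cached alongside other table state

    // Build the calculator once, at the point where table state is set up.
    void initializeTable(String tableName) {
        this.tableName = tableName;
        this.sizeCalculator = new SizeCalculatorStub(tableName);
    }

    // Reuse the cached calculator instead of constructing a new one per call.
    List<String> getSplits() {
        if (sizeCalculator == null) {
            throw new IllegalStateException("initializeTable must be called first");
        }
        List<String> splits = new ArrayList<>();
        splits.add(tableName + ":region-0 size=" + sizeCalculator.getRegionSize("region-0"));
        return splits;
    }
}
```

With this shape, the simulated scan count depends on how often
initializeTable() runs, not on how often getSplits() runs. One trade-off of
any such cache is that region sizes observed by later getSplits() calls
reflect the time the calculator was built.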
--
This message was sent by Atlassian Jira
(v8.20.1#820001)