[ https://issues.apache.org/jira/browse/HBASE-26878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell resolved HBASE-26878.
-----------------------------------------
Fix Version/s: 2.5.0, 2.6.0, 3.0.0-alpha-3, 2.4.12
Hadoop Flags: Reviewed
Resolution: Fixed
> TableInputFormatBase should cache RegionSizeCalculator
> ------------------------------------------------------
>
> Key: HBASE-26878
> URL: https://issues.apache.org/jira/browse/HBASE-26878
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Minor
> Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3, 2.4.12
>
>
> TableInputFormatBase's getSplits() method instantiates a new
> RegionSizeCalculator every time. Instantiating a RegionSizeCalculator
> involves scanning meta for all region locations of a given table. This can
> be costly for large tables, and we don't know how often a subclass will call
> getSplits().
> When initializeTable is called, we already cache the RegionLocator and Admin
> that are used for passing into the RegionSizeCalculator. We should similarly
> cache the RegionSizeCalculator itself at that same time to avoid unnecessary
> meta scans on repeated getSplits() calls.
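
The caching idea described above can be illustrated with a minimal, self-contained sketch. It does not use the real HBase classes: SizeCalculatorSketch and CachingInputFormatSketch are hypothetical stand-ins for RegionSizeCalculator and TableInputFormatBase, the region names and sizes are made up, and the "meta scan" is simulated by a counter. The sketch builds the calculator once in initializeTable, as the description proposes; it is not the committed patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for RegionSizeCalculator: constructing it is the expensive step. */
class SizeCalculatorSketch {
    /** Counts simulated meta scans so the effect of caching can be observed. */
    static final AtomicInteger META_SCANS = new AtomicInteger();

    private final Map<String, Long> sizeByRegion = new TreeMap<>();

    SizeCalculatorSketch(String tableName) {
        META_SCANS.incrementAndGet(); // simulated scan of meta for the table's regions
        sizeByRegion.put(tableName + ",region-a", 128L); // made-up region sizes
        sizeByRegion.put(tableName + ",region-b", 256L);
    }

    Map<String, Long> sizes() {
        return sizeByRegion;
    }
}

/** Hypothetical stand-in for TableInputFormatBase. */
class CachingInputFormatSketch {
    private String tableName;
    private SizeCalculatorSketch sizeCalculator; // cached alongside the table state

    /** Builds the calculator once, at the point where the table state is set up. */
    void initializeTable(String tableName) {
        this.tableName = tableName;
        this.sizeCalculator = new SizeCalculatorSketch(tableName);
    }

    /** Reuses the cached calculator, so repeated calls trigger no further scans. */
    List<String> getSplits() {
        if (sizeCalculator == null) {
            throw new IllegalStateException("initializeTable must be called first");
        }
        List<String> splits = new ArrayList<>();
        for (Map.Entry<String, Long> e : sizeCalculator.sizes().entrySet()) {
            splits.add(e.getKey() + ":" + e.getValue());
        }
        return splits;
    }

    public static void main(String[] args) {
        CachingInputFormatSketch format = new CachingInputFormatSketch();
        format.initializeTable("t1");
        format.getSplits();
        format.getSplits();
        System.out.println("simulated meta scans: " + SizeCalculatorSketch.META_SCANS.get());
    }
}
```

Calling initializeTable again replaces the cached calculator, which is how this sketch would pick up changed region boundaries. Building the calculator lazily on the first getSplits() call is an equally plausible variant.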