[ https://issues.apache.org/jira/browse/HBASE-25624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell reassigned HBASE-25624:
-------------------------------------------

    Assignee:     (was: Prathyusha)

> Bound LoadBalancer's RegionLocationFinder cache
> -----------------------------------------------
>
>                 Key: HBASE-25624
>                 URL: https://issues.apache.org/jira/browse/HBASE-25624
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer, master, Operability
>    Affects Versions: 1.6.0, 2.4.1
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 2.5.0, 1.8.0, 3.0.0-alpha-2
>
>
> We have a large table in production that causes the balancer's
> RegionLocationFinder cache to consume 4 GB of heap, which, among other
> factors, triggered OOMEs, and made us aware of this problem.
> RegionLocationFinder embeds a cache backed by Guava's CacheLoader. The
> RegionLocationFinder cache comes to consume heap for RegionInfos for all
> table regions and all HDFS block locations of all store files for all regions
> of all tables.
> The only limit we pass to the CacheBuilder is an expiration time of 14400000
> milliseconds for individual cache entries. That's 4 hours. That's much too
> long; however, the cache also periodically refreshes itself, where the need
> for a refresh is checked whenever BaseLoadBalancer calls
> RegionLocationFinder's setClusterMetrics() method, which defeats the
> expiration based limit anyway.
> We should be bounding this cache with effective resource controls. Time based
> expiry is fine but the periodic refresh logic must be removed to make it
> effective. Implement size based limits too. CacheBuilder#maximumSize will
> limit by number of cache entries. This might be fine but
> CacheBuilder#maximumWeight would be better, where weight is something
> determined by the API user. In this case it can be an estimate of the heap
> size of the hash map entries kept in the cache.
> Default should remain unbounded. Specific bounds should be supported by new
> site configuration options.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
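
The weight-based bound proposed in the issue can be illustrated with a minimal sketch. It is not the HBase patch and does not use Guava: `WeightBoundedCache` is a hypothetical JDK-only stand-in, and the weigher passed in the usage below (array length) is only an assumed placeholder for a real heap-size estimate. In Guava, the corresponding controls would be `CacheBuilder#maximumWeight` together with `CacheBuilder#weigher`.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical illustration of a weight-bounded cache; not HBase or Guava code.
class WeightBoundedCache<K, V> {
  private final long maxWeight;
  private final ToLongFunction<V> weigher;
  // accessOrder=true keeps the least recently used entry first in iteration order.
  private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
  private long totalWeight = 0;

  WeightBoundedCache(long maxWeight, ToLongFunction<V> weigher) {
    this.maxWeight = maxWeight;
    this.weigher = weigher;
  }

  synchronized V get(K key) {
    return entries.get(key);
  }

  // Values are assumed to be non-null.
  synchronized void put(K key, V value) {
    V old = entries.put(key, value);
    if (old != null) {
      totalWeight -= weigher.applyAsLong(old);
    }
    totalWeight += weigher.applyAsLong(value);
    // Evict least recently used entries until the total weight fits the bound.
    Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
    while (totalWeight > maxWeight && it.hasNext()) {
      Map.Entry<K, V> eldest = it.next();
      totalWeight -= weigher.applyAsLong(eldest.getValue());
      it.remove();
    }
  }

  synchronized long totalWeight() {
    return totalWeight;
  }

  synchronized int size() {
    return entries.size();
  }
}
```

For example, `new WeightBoundedCache<String, String[]>(5, v -> v.length)` caps the cache at a total of five array elements across all entries, whatever the entry count. Bounding by weight rather than by count matters here because, per the description, one entry can carry the block locations of many store files.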