[ https://issues.apache.org/jira/browse/RANGER-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fateh Singh updated RANGER-4761:
--------------------------------
    Summary: Reduce memory footprint of hbase plugin  (was: Revisit ColumnFamilyCache to reduce memory footprint of hbase plugin)

> Reduce memory footprint of hbase plugin
> ---------------------------------------
>
>                 Key: RANGER-4761
>                 URL: https://issues.apache.org/jira/browse/RANGER-4761
>             Project: Ranger
>          Issue Type: Improvement
>          Components: Ranger
>            Reporter: Fateh Singh
>            Assignee: Fateh Singh
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Map<String, Set<String>> getColumnFamilies(Map<byte[], ? extends Collection<?>> families) was the original bottleneck: it caused the Ranger Authz CP to take 60% of Put RPC time. It is a computationally heavy function that converts bytes to strings and type-casts each Collection to a set of strings.
> * ColumnFamilyCache was introduced to fix this issue, but the caching does not work. The columns and column families accessed for a table vary from request to request, which results in many *cache misses*, so *getColumnFamilies()* still gets called. The outcome is an additional memory requirement for the cache (which grows exponentially due to the large number of columns) and double computation (the new family map is serialized to check the cache, and getColumnFamilies() is computed as well). E.g., if the first call comes for cf1:c1,c3,c4 and the second call comes for cf1:c1,c2,c5, then the second call is a cache miss because the set of columns accessed differs from the first call. Both entries get added to the cache.
> * Validation: Added debug statements for cache hit and miss counts.
> Word count in the log file for Cache Miss and Cache Hit is shown below; the cache size reaches the default maximum of 1024.
> {code:java}
> [root@ccycloud-2 hbase]# grep 'Cache Miss' hbase-cmf-HBASE-1-REGIONSERVER-ccycloud-2.fs2-7192.root.comops.site.log.out | wc -l
> 10584
> [root@ccycloud-2 hbase]# grep 'Cache Hit' hbase-cmf-HBASE-1-REGIONSERVER-ccycloud-2.fs2-7192.root.comops.site.log.out | wc -l
> 0
> {code}
> {code:java}
> 2024-02-09 17:12:31,093 DEBUG org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor: evaluateAccess: Cache Size:1024
> 2024-02-09 17:12:31,096 DEBUG org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor: evaluateAccess: Cache Size:1024
> {code}
> Another issue:
> The key computation for the cache has a bug: the address of the byte array, rather than its value, is used as the key in the hashmap, so every lookup is a cache miss irrespective of the request.
> {code:java}
> 2024-02-09 18:12:12,310 DEBUG org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor: evaluateAccess: Cache Miss for user[atlas] for key: atlas_janus:[[B@1b59403b=null]
> {code}
>
> The implementation needs to be revisited to reduce the memory footprint.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
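
The key bug described in the issue (a byte[] used as a hashmap key) can be reproduced outside the plugin. The following is a minimal sketch using only the JDK, not the Ranger code itself: the class name ByteArrayKeyDemo, both method names, and the choice of ByteBuffer as a content-based key are illustrative assumptions. Java arrays inherit identity-based hashCode() and equals() from Object, so a second array with identical contents never matches the first.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of Ranger or HBase.
class ByteArrayKeyDemo {

    // Uses the raw byte[] as the map key. Arrays use identity-based
    // hashCode()/equals(), so an equal-content array is a different key.
    static boolean hitWithRawArrayKey() {
        Map<byte[], String> cache = new HashMap<>();
        cache.put("cf1".getBytes(StandardCharsets.UTF_8), "cached");
        // A fresh array with the same bytes, as a new request would carry.
        return cache.containsKey("cf1".getBytes(StandardCharsets.UTF_8));
    }

    // Wraps the bytes in a ByteBuffer, whose hashCode()/equals() are
    // computed from the buffer contents, so equal bytes give a hit.
    static boolean hitWithContentKey() {
        Map<ByteBuffer, String> cache = new HashMap<>();
        cache.put(ByteBuffer.wrap("cf1".getBytes(StandardCharsets.UTF_8)), "cached");
        return cache.containsKey(ByteBuffer.wrap("cf1".getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println("raw byte[] key hit:    " + hitWithRawArrayKey());
        System.out.println("content-based key hit: " + hitWithContentKey());
    }
}
```

The first lookup misses and the second hits, which matches the "[B@1b59403b" fragment in the logged key: that text is the default array toString(), not the column family name. A content-based key only removes the always-miss behaviour; it does not address the varying column sets described in the issue, which would still produce misses.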