[
https://issues.apache.org/jira/browse/RANGER-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fateh Singh updated RANGER-4761:
--------------------------------
Summary: Reduce memory footprint of hbase plugin (was: Revisit
ColumnFamilyCache to reduce memory footprint of hbase plugin)
> Reduce memory footprint of hbase plugin
> ---------------------------------------
>
> Key: RANGER-4761
> URL: https://issues.apache.org/jira/browse/RANGER-4761
> Project: Ranger
> Issue Type: Improvement
> Components: Ranger
> Reporter: Fateh Singh
> Assignee: Fateh Singh
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> * Map<String, Set<String>> getColumnFamilies(Map<byte[], ? extends
> Collection<?>> families) was the original bottleneck: it caused the Ranger
> authorization coprocessor (Authz CP) to take 60% of Put RPC time. It is a
> computationally heavy function that converts bytes to strings and
> type-casts each Collection to a set of strings.
> * ColumnFamilyCache was introduced to fix this issue, but the caching does
> not work: the columns and column families accessed for a table vary from
> request to request, which results in many *cache misses*, and so
> getColumnFamilies() still gets called. The cache therefore adds memory
> overhead (the number of distinct keys grows combinatorially with the
> number of columns) and doubles the computation (the new family map is
> serialized to check the cache, and getColumnFamilies() is computed as
> well). E.g. the first call comes for cf1:c1,c3,c4 and the second call comes
> for cf1:c1,c2,c5; the second call is a cache miss because the set of
> columns accessed is different from the first call. Both entries get added
> to the cache.
> * Validation: added debug statements that log cache hits and misses, then
> counted the "Cache Miss" and "Cache Hit" lines in the log file. There are
> 10584 misses and 0 hits, and the cache size reaches its default maximum of
> 1024.
> {code:java}
> [root@ccycloud-2 hbase]# grep 'Cache Miss' hbase-cmf-HBASE-1-REGIONSERVER-ccycloud-2.fs2-7192.root.comops.site.log.out | wc -l
> 10584
> [root@ccycloud-2 hbase]# grep 'Cache Hit' hbase-cmf-HBASE-1-REGIONSERVER-ccycloud-2.fs2-7192.root.comops.site.log.out | wc -l
> 0
> {code}
> {code:java}
> 2024-02-09 17:12:31,093 DEBUG org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor: evaluateAccess: Cache Size:1024
> 2024-02-09 17:12:31,096 DEBUG org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor: evaluateAccess: Cache Size:1024
> {code}
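The cf1 example above can be reproduced with a small stand-alone sketch. It is not the Ranger ColumnFamilyCache code: the class name, the key format, and the unbounded HashMap are all assumptions made for illustration. It only shows that a cache keyed on the exact set of columns stores one entry per distinct combination and hits only when the same combination repeats.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a cache keyed on table + family + exact column set.
public class ColumnSetCacheDemo {
    private final Map<String, Map<String, Set<String>>> cache = new HashMap<>();
    int hits = 0;
    int misses = 0;

    // Assumed key format: columns are sorted so that ordering alone never causes a miss.
    static String key(String table, String family, List<String> columns) {
        return table + ":" + family + ":" + new TreeSet<>(columns);
    }

    Map<String, Set<String>> lookup(String table, String family, List<String> columns) {
        String k = key(table, family, columns);
        Map<String, Set<String>> cached = cache.get(k);
        if (cached != null) {
            hits++;
            return cached;
        }
        // On a miss the key was built AND the result is computed: double work.
        misses++;
        Map<String, Set<String>> computed = Map.of(family, Set.copyOf(columns));
        cache.put(k, computed);
        return computed;
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        ColumnSetCacheDemo demo = new ColumnSetCacheDemo();
        demo.lookup("t1", "cf1", List.of("c1", "c3", "c4")); // first call
        demo.lookup("t1", "cf1", List.of("c1", "c2", "c5")); // different columns: miss
        System.out.println("hits=" + demo.hits + " misses=" + demo.misses
                + " size=" + demo.size()); // prints hits=0 misses=2 size=2
    }
}
```

With wide tables and varied access patterns, almost every request produces a new key, so a bounded cache fills to its limit while the hit count stays near zero.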
> Another issue:
> The cache key computation has a bug: the string form of the byte array
> reference (e.g. [B@1b59403b) is used in the HashMap key instead of the
> array's contents, so every lookup is a cache miss irrespective of the
> request.
> {code:java}
> 2024-02-09 18:12:12,310 DEBUG org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor: evaluateAccess: Cache Miss for user[atlas] for key: atlas_janus:[[B@1b59403b=null]
> {code}
>
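The logged key has the shape that Java produces when a map entry with a byte[] key is turned into a string. The sketch below reproduces that shape and contrasts it with a content-based key. It is an illustration under assumptions, not the plugin's code: the class and method names are hypothetical, and decoding with UTF-8 via java.lang.String stands in for whatever byte-to-string helper the plugin actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo of identity-based vs content-based cache keys for byte[] families.
public class ByteArrayKeyDemo {
    // Mirrors the logged shape: byte[] falls back to Object.toString(), i.e. "[B@<hash>".
    static String identityKey(String table, Map<byte[], ? extends Collection<?>> families) {
        return table + ":" + families.entrySet();
    }

    // Assumed alternative: decode each family name so the key depends on its contents.
    static String contentKey(String table, Map<byte[], ? extends Collection<?>> families) {
        Map<String, String> decoded = new TreeMap<>();
        for (Map.Entry<byte[], ? extends Collection<?>> e : families.entrySet()) {
            decoded.put(new String(e.getKey(), StandardCharsets.UTF_8),
                    String.valueOf(e.getValue()));
        }
        return table + ":" + decoded;
    }

    // Builds a request-like family map with a fresh byte[] on every call.
    static Map<byte[], Collection<?>> request(String family) {
        Map<byte[], Collection<?>> m = new HashMap<>();
        m.put(family.getBytes(StandardCharsets.UTF_8), null);
        return m;
    }

    public static void main(String[] args) {
        // Two requests for the same family carry different array instances.
        System.out.println(identityKey("atlas_janus", request("cf1")));
        System.out.println(identityKey("atlas_janus", request("cf1")));
        // The content-based key is identical for both requests.
        System.out.println(contentKey("atlas_janus", request("cf1")));
        System.out.println(contentKey("atlas_janus", request("cf1")));
    }
}
```

A content-based key only fixes the always-miss bug. Requests that touch different column sets would still miss, so it does not by itself address the memory growth described earlier.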
> The implementation needs to be revisited to reduce the memory footprint.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)