[
https://issues.apache.org/jira/browse/SENTRY-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958503#comment-14958503
]
Colin Ma commented on SENTRY-565:
---------------------------------
[~lskuff], I don't think the current patch is good. It doesn't resolve the
privilege cache problem, and it makes the code structure confusing.
I'd like to resolve this problem as follows:
1. Add a privilege cache to SimpleDBPolicyEngine as a Map<String,
Set<CachedPrivileges>>. CachedPrivileges will hold the privileges and the
time of the backend call.
2. When SimpleDBPolicyEngine.getPrivileges is called, check the cache. If the
entry does not exist or has expired, get the privileges from the backend. The
expiry time is configurable and defaults to 5 minutes.
Do you think this is a reasonable approach to the problem?
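
The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual SimpleDBPolicyEngine code: the class PrivilegeCache, the backend function, and the injectable clock are hypothetical names, privileges are represented as plain strings, and the map holds one CachedPrivileges entry per key (rather than a Set of them) to keep the sketch short.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical holder: the privileges plus the time of the backend call.
class CachedPrivileges {
    final Set<String> privileges;
    final long fetchedAtMillis;

    CachedPrivileges(Set<String> privileges, long fetchedAtMillis) {
        this.privileges = privileges;
        this.fetchedAtMillis = fetchedAtMillis;
    }
}

// Hypothetical time-based cache placed in front of the privilege backend.
class PrivilegeCache {
    // Default expiry of 5 minutes, as proposed in the comment.
    static final long DEFAULT_TTL_MILLIS = 5 * 60 * 1000L;

    private final Map<String, CachedPrivileges> cache = new ConcurrentHashMap<>();
    // Stands in for the call to the backend (assumed to be the expensive RPC).
    private final Function<String, Set<String>> backend;
    private final long ttlMillis;
    // Injectable clock so expiry can be exercised without waiting.
    private final LongSupplier clock;

    PrivilegeCache(Function<String, Set<String>> backend, long ttlMillis, LongSupplier clock) {
        this.backend = backend;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns cached privileges; calls the backend only when the entry
    // is missing or at least ttlMillis old.
    Set<String> getPrivileges(String key) {
        long now = clock.getAsLong();
        CachedPrivileges entry = cache.get(key);
        if (entry == null || now - entry.fetchedAtMillis >= ttlMillis) {
            // Two threads may both miss and both fetch; the last write wins.
            entry = new CachedPrivileges(backend.apply(key), now);
            cache.put(key, entry);
        }
        return entry.privileges;
    }
}
```

With something like this in place, repeated getPrivileges calls for the same key within the expiry window would reach the backend only once, so a loop such as the one in filterShowTables would no longer pay for a backend call per table. The trade-off is that a privilege change may take up to the expiry time to become visible.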
> Improvement the performance when Sentry filter the entity
> ---------------------------------------------------------
>
> Key: SENTRY-565
> URL: https://issues.apache.org/jira/browse/SENTRY-565
> Project: Sentry
> Issue Type: Improvement
> Reporter: Colin Ma
> Assignee: Colin Ma
> Attachments: SENTRY-565.001.patch, SENTRY-565.002.patch
>
>
> Currently, when metadata is fetched from Hive, e.g. with "show tables" or
> "show databases", Sentry filters the result and outputs only the authorized
> entities. Filtering the result involves many RPC calls. The related code is
> in HiveAuthzBinding, for example, in filterShowTables:
> {code}
> ......
> for (String tableName : queryResult) {
> ......
> hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject,
> inputHierarchy,
> outputHierarchy, providedPrivileges);
> ......
> }
> ......
> {code}
> hiveAuthzBinding.authorize gets the privileges from the Sentry service. If
> there are many tables in Hive, the filtering process takes a long
> time. Considering that Sentry also needs to filter columns, HiveAuthzBinding
> should be improved to reduce the number of RPC calls made while filtering.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)