[ https://issues.apache.org/jira/browse/IMPALA-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010913#comment-17010913 ]
Vihang Karajgaonkar commented on IMPALA-9242:
---------------------------------------------

Thanks [~csringhofer] for taking this up. I have a couple of questions on your comment above:

{quote}1. Create a separate cache with efficient look up by authorizable, possible by simply reusing TreePrivilegeCache from Sentry. This cache would need to be recreated whenever the user's/role's privileges change. This seems an easy solution + it would not affect Ranger, but would likely ~double the memory consumption of privileges.{quote}

I agree that {{TreePrivilegeCache}} is not generic enough to be used directly in Impala. I had raised these concerns in the review (specifically, supporting adding/removing privileges in the cache so that it can be a long-lived cache), but unfortunately it couldn't be implemented that way. I think SENTRY-2539 is designed specifically with Hive in mind, since it loads all the user privileges at the beginning every time before doing the privilege check, which I think is not very efficient in the first place.

However, I did not understand why using {{TreePrivilegeCache}} directly would double the memory. Is it because you will still need to store the privileges separately for each individual user/role? Maybe we can have a hierarchical cache instead of {{CatalogObjectCache}} in the {{Principal}} class here: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/Principal.java#L43. If {{TreePrivilegeCache}} cannot be used directly, maybe we can extend it? I am not sure if it's worth extending it since that way we have better control over the cache implementation.

{quote}2. Change AuthorizationPolicy ( https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java ) to store Privileges in a "more treelike manner" to allow efficient look up by authorizable. This is a more complex change and could affect Ranger too.{quote}

I am not very familiar with the Ranger implementation.
Do we cache privileges for Ranger too?

> Access check should only check against the privileges of the authorizable
> -------------------------------------------------------------------------
>
>                 Key: IMPALA-9242
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9242
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Csaba Ringhofer
>            Priority: Major
>
> Currently, according to the implementation of
> https://github.com/apache/sentry/blob/branch-2.1.0/sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java#L64
> each access check request in Sentry is done against all the privileges of the
> user. Instead, we can reduce the number of privilege checks significantly if
> we use this API in
> https://github.com/apache/sentry/blob/master/sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/PrivilegeCache.java#L46
> Unfortunately, SENTRY-1291, which is merged in the master branch of Sentry, is
> unavailable. However, if we can have interface-side changes in
> PrivilegeCache, Impala can implement a prefix-tree based {{PrivilegeCache}}
> so that the privileges returned are only those related to the given
> authorizable. This API can then be used in SimpleCacheProviderBackend to
> reduce the processing time required to check access for a large number of
> objects in large setups.
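
For illustration only, the prefix-tree based cache mentioned in the issue description could look roughly like the sketch below. It is not Sentry's {{TreePrivilegeCache}} or any existing Impala class: the class name, the method names, and the use of plain strings for privileges and authorizable path components are all assumptions made for the example. It also assumes that a privilege granted on a parent (server, database) is relevant to every object below it, so a lookup only collects privileges along the path to the requested authorizable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not part of Sentry or Impala. */
class PrefixTreePrivilegeCacheSketch {
  /** One node per authorizable path component, e.g. server1 -> db1 -> tbl1. */
  private static final class Node {
    final Map<String, Node> children = new HashMap<>();
    final List<String> privileges = new ArrayList<>();
  }

  private final Node root = new Node();

  /** Stores a privilege at the node identified by the authorizable path. */
  public void add(List<String> path, String privilege) {
    Node node = root;
    for (String part : path) {
      node = node.children.computeIfAbsent(part, k -> new Node());
    }
    node.privileges.add(privilege);
  }

  /**
   * Removes one occurrence of a privilege from the given path, so the cache
   * can be updated in place instead of being rebuilt. Empty nodes are not
   * pruned in this sketch. Returns true if the privilege was present.
   */
  public boolean remove(List<String> path, String privilege) {
    Node node = root;
    for (String part : path) {
      node = node.children.get(part);
      if (node == null) return false;
    }
    return node.privileges.remove(privilege);
  }

  /**
   * Returns the privileges stored on the requested authorizable and on its
   * ancestors. Privileges on unrelated siblings are never visited.
   */
  public List<String> lookup(List<String> path) {
    List<String> result = new ArrayList<>(root.privileges);
    Node node = root;
    for (String part : path) {
      node = node.children.get(part);
      if (node == null) break; // nothing stored below this point
      result.addAll(node.privileges);
    }
    return result;
  }
}
```

With a structure like this, the cost of a lookup depends on the depth of the authorizable rather than on the total number of privileges held by the user or role. One such tree per principal would still be needed, which is the per-user/role storage cost raised in the question about memory consumption above.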