[ https://issues.apache.org/jira/browse/HIVE-10022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392698#comment-15392698 ]
Sushanth Sowmyan commented on HIVE-10022: ----------------------------------------- >From the above test failures, there are 3 relevant failures: * org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform * org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_insert * org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_insert_local Of these, the first one, that of authorization_disallow_transform.q, is a good failure to have had, since it also demonstrates the base bug - the .q.out previously generated had a URI disallow error because it kept looked for a blank parent, and then recursed down that, rather than failing because transforms were disallowed. Thus, the fix for the first one is to regenerate the .q.out file. The remaining two issues are valid bugs in our current patch, where the parent-determination logic is incorrect in our current impl. The way we do this now is by looking for the filestatus of a dir, or a filestatus of the first parent that exists. Then, we compare that against the provided dir, and decide that if they're not identical, then we must be in the parent case. This is faulty logic, and we shouldn't be doing string compares if possible, especially since we already have FileUtils.getFileStatusOrNull that solves the same issue. I've cleaned up that logic to do a better job of determining if we're going down the parent case. The new logic is as follows: * get the fileStatus corresponding to this path from FileUtils.getFileStatusOrNull * If we got back null, then this does not exist, and thus, we're going down the parent-picking line. Otherwise, we can use it as-is. * Also, since our parent-picking utility function is via FileUtils.getPathOrParentThatExists , passing the direct filePath to it directly is a bit of a waste since we've already determined that it does not exist. So, I added in one tiny bit of optimization to prevent double-calling the first time, by calling FileUtils.getPathOrParentThatExists(fs, filePath.getParent()) rather than FileUtils.getPathOrParentThatExists(fs, filePath) , both of which will return an identical result in this case. See https://gist.github.com/khorgath/9eeec30b0035dfdc70ae24dab2dd9923 for a diff between the two patch states (b/w .7.patch and .8.patch) > Authorization checks for non existent file/directory should not be recursive > ---------------------------------------------------------------------------- > > Key: HIVE-10022 > URL: https://issues.apache.org/jira/browse/HIVE-10022 > Project: Hive > Issue Type: Bug > Components: Authorization > Affects Versions: 0.14.0 > Reporter: Pankit Thapar > Assignee: Sushanth Sowmyan > Attachments: HIVE-10022.2.patch, HIVE-10022.3.patch, > HIVE-10022.4.patch, HIVE-10022.5.patch, HIVE-10022.6.patch, > HIVE-10022.7.patch, HIVE-10022.patch > > > I am testing a query like : > set hive.test.authz.sstd.hs2.mode=true; > set > hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest; > set > hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateConfigUserAuthenticator; > set hive.security.authorization.enabled=true; > set user.name=user1; > create table auth_noupd(i int) clustered by (i) into 2 buckets stored as orc > location '${OUTPUT}' TBLPROPERTIES ('transactional'='true'); > Now, in the above query, since authorization is true, > we would end up calling doAuthorizationV2() which ultimately ends up calling > SQLAuthorizationUtils.getPrivilegesFromFS() which calls a recursive method : > FileUtils.isActionPermittedForFileHierarchy() with the object or the ancestor > of the object we are trying to authorize if the object does not exist. > The logic in FileUtils.isActionPermittedForFileHierarchy() is DFS. > Now assume, we have a path as a/b/c/d that we are trying to authorize. > In case, a/b/c/d does not exist, we would call > FileUtils.isActionPermittedForFileHierarchy() with say a/b/ assuming a/b/c > also does not exist. > If under the subtree at a/b, we have millions of files, then > FileUtils.isActionPermittedForFileHierarchy() is going to check file > permission on each of those objects. > I do not completely understand why do we have to check for file permissions > in all the objects in branch of the tree that we are not trying to read > from /write to. > We could have checked file permission on the ancestor that exists and if it > matches what we expect, the return true. > Please confirm if this is a bug so that I can submit a patch else let me know > what I am missing ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)