[ 
https://issues.apache.org/jira/browse/HIVE-10022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392698#comment-15392698
 ] 

Sushanth Sowmyan commented on HIVE-10022:
-----------------------------------------

From the above test failures, there are 3 relevant failures:

 * 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
 * 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_insert
 * 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_insert_local

Of these, the first one, authorization_disallow_transform.q, is a good 
failure to have had, since it also demonstrates the base bug: the previously 
generated .q.out had a URI disallow error because it kept looking for a 
blank parent and then recursed down that, rather than failing because 
transforms were disallowed. Thus, the fix for the first one is to regenerate 
the .q.out file.

The remaining two failures are valid bugs in our current patch, where the 
parent-determination logic is incorrect. The way we do this now is by looking 
up the FileStatus of the dir, or the FileStatus of the first parent that 
exists. Then we compare that against the provided dir and decide that, if 
they're not identical, we must be in the parent case. This is faulty logic, 
and we shouldn't be doing string compares if we can avoid them, especially 
since we already have FileUtils.getFileStatusOrNull, which solves the same 
issue. I've cleaned up that logic to do a better job of determining whether 
we're going down the parent case. The new logic is as follows:

 * Get the FileStatus corresponding to this path from 
FileUtils.getFileStatusOrNull.
 * If we got back null, then the path does not exist, and thus we're going 
down the parent-picking line. Otherwise, we can use it as-is.
 * Also, since our parent-picking utility function is 
FileUtils.getPathOrParentThatExists, passing filePath to it directly is a bit 
of a waste, since we've already determined that it does not exist. So I added 
one tiny optimization to avoid checking the same path twice, by calling 
FileUtils.getPathOrParentThatExists(fs, filePath.getParent()) rather than 
FileUtils.getPathOrParentThatExists(fs, filePath), both of which return an 
identical result in this case.
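
As a rough illustration of the logic above (not the actual patch; see the 
gist for that), here is a minimal sketch that uses java.nio.file in place of 
the Hadoop FileSystem API so that it runs standalone. The class name and the 
helpers statusOrNull, pathOrParentThatExists and pathToAuthorize are 
hypothetical stand-ins; only the overall flow mirrors what is described above, 
and the real FileUtils signatures may differ.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch only: java.nio.file stands in for the Hadoop FileSystem API.
class ParentPickingSketch {

    // Stand-in for FileUtils.getFileStatusOrNull (assumed behavior):
    // returns the path itself if it exists, otherwise null.
    static Path statusOrNull(Path p) {
        return Files.exists(p) ? p : null;
    }

    // Stand-in for FileUtils.getPathOrParentThatExists (assumed behavior):
    // walks up from p until an existing path is found.
    static Path pathOrParentThatExists(Path p) {
        Path cur = p;
        while (cur != null && !Files.exists(cur)) {
            cur = cur.getParent();
        }
        return cur; // null only if no ancestor exists at all
    }

    // Decide which path to run the authorization check against.
    static Path pathToAuthorize(Path filePath) {
        if (statusOrNull(filePath) != null) {
            // The path exists, so it is used as-is (not the parent case).
            return filePath;
        }
        // Parent case: filePath is already known to be missing, so start the
        // upward walk from its parent instead of re-checking filePath.
        return pathOrParentThatExists(filePath.getParent());
    }
}
```

With this sketch, calling pathToAuthorize on a missing a/b/c/d where only a/b 
exists returns a/b, and calling it on an existing path returns that path 
unchanged, with no string comparison needed to tell the two cases apart.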

See https://gist.github.com/khorgath/9eeec30b0035dfdc70ae24dab2dd9923 for a 
diff between the two patch states (between .7.patch and .8.patch).

> Authorization checks for non existent file/directory should not be recursive
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-10022
>                 URL: https://issues.apache.org/jira/browse/HIVE-10022
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization
>    Affects Versions: 0.14.0
>            Reporter: Pankit Thapar
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-10022.2.patch, HIVE-10022.3.patch, 
> HIVE-10022.4.patch, HIVE-10022.5.patch, HIVE-10022.6.patch, 
> HIVE-10022.7.patch, HIVE-10022.patch
>
>
> I am testing a query like : 
> set hive.test.authz.sstd.hs2.mode=true;
> set 
> hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
> set 
> hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateConfigUserAuthenticator;
> set hive.security.authorization.enabled=true;
> set user.name=user1;
> create table auth_noupd(i int) clustered by (i) into 2 buckets stored as orc 
> location '${OUTPUT}' TBLPROPERTIES ('transactional'='true');
> Now, in the above query, since authorization is enabled, 
> we would end up calling doAuthorizationV2(), which ultimately ends up calling 
> SQLAuthorizationUtils.getPrivilegesFromFS(), which calls a recursive method, 
> FileUtils.isActionPermittedForFileHierarchy(), with the object or, if the 
> object does not exist, the ancestor of the object we are trying to authorize. 
> The logic in FileUtils.isActionPermittedForFileHierarchy() is DFS.
> Now assume we have a path a/b/c/d that we are trying to authorize.
> In case a/b/c/d does not exist, we would call 
> FileUtils.isActionPermittedForFileHierarchy() with, say, a/b/, assuming a/b/c 
> also does not exist.
> If we have millions of files under the subtree at a/b, then 
> FileUtils.isActionPermittedForFileHierarchy() is going to check file 
> permissions on each of those objects. 
> I do not completely understand why we have to check file permissions 
> on all the objects in a branch of the tree that we are not trying to read 
> from or write to. 
> We could have checked file permissions on the ancestor that exists and, if 
> they match what we expect, then return true.
> Please confirm whether this is a bug so that I can submit a patch; otherwise, 
> let me know what I am missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
