Abhay Kulkarni created RANGER-4745:
--------------------------------------

             Summary: Enhance handling of subAccess authorization in Ranger 
HDFS plugin
                 Key: RANGER-4745
                 URL: https://issues.apache.org/jira/browse/RANGER-4745
             Project: Ranger
          Issue Type: Improvement
          Components: Ranger
            Reporter: Abhay Kulkarni


Currently, Ranger authorizes HDFS commands that require access to the 
hierarchy of files and directories rooted at the argument passed to the 
command, as described below. Examples of such commands are:

{quote}hdfs dfs -count -q -h -v <directory>; hdfs dfs -R <directory>
{quote}
HDFS Authorization Interface

When these commands are invoked, the HDFS Namenode builds a tree of i-nodes 
corresponding to <directory> and passes it to the authorizer with a flag 
indicating that subAccess (access to the directory hierarchy rooted at 
<directory>) is to be checked.

Ranger implementation

For each directory in the hierarchy rooted at <directory>, Ranger code checks 
if the requested permissions (typically read and execute) are allowed using 
only Ranger policies. If any directory in the top-down path starting from 
<directory> does not allow access, then the authorization steps done until then 
are discarded, and the HDFS default authorizer is called upon to check the 
access with the same arguments. The default authorizer only checks the HDFS 
ACLs (and not any Ranger policies) on each directory in the hierarchy to 
determine the access.
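
The current behaviour described above can be sketched roughly as follows. This 
is a simplified illustration in Python, not the plugin's actual code: the 
function name, the `ranger_allows` and `default_authorizer` callables, and the 
shape of `original_args` are all hypothetical. It also assumes that access is 
granted when Ranger policies allow every directory in the hierarchy.

```python
from typing import Any, Callable, Sequence


def check_sub_access_existing(
    directories: Sequence[str],
    ranger_allows: Callable[[str], bool],
    default_authorizer: Callable[[Any], bool],
    original_args: Any,
) -> bool:
    """Illustrative model of the existing subAccess handling.

    directories        -- top-down list of directories rooted at <directory>
    ranger_allows      -- hypothetical check using only Ranger policies
    default_authorizer -- hypothetical stand-in for the HDFS default
                          authorizer, which checks only HDFS ACLs
    original_args      -- the arguments originally passed to the authorizer
    """
    for directory in directories:
        if not ranger_allows(directory):
            # Everything Ranger decided so far is discarded; the default
            # authorizer re-checks the whole hierarchy with the same arguments.
            return default_authorizer(original_args)
    # Assumption: every directory was allowed by Ranger policies.
    return True
```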

Design of new Ranger implementation

For each directory in the hierarchy rooted at <directory>, the new Ranger 
design:
1. Checks whether the requested permissions are allowed using only Ranger 
policies.
2. If the access is denied, discards the authorization steps done until this 
point and calls the HDFS default authorizer to check the access with the 
original set of arguments; the result of the default authorizer is returned to 
Namenode.
3. Otherwise, if the access is not determined, constructs a new set of 
arguments for the directory being processed and calls the HDFS default 
authorizer to check the access with the modified set of arguments.
4. If the default authorizer does not allow the access, returns that result to 
Namenode.
5. Otherwise, continues processing with the next directory.
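
The steps above can be sketched as the loop below. This is a minimal Python 
model under stated assumptions, not the proposed Ranger code: the `Decision` 
enum, the function name, and the `ranger_check`, `default_authorizer` and 
`args_for_directory` callables are hypothetical names introduced for 
illustration. It also assumes that access is granted once every directory has 
been processed without a denial.

```python
from enum import Enum
from typing import Any, Callable, Sequence


class Decision(Enum):
    ALLOWED = "allowed"
    DENIED = "denied"
    NOT_DETERMINED = "not_determined"


def check_sub_access_new(
    directories: Sequence[str],
    ranger_check: Callable[[str], Decision],
    default_authorizer: Callable[[Any], bool],
    original_args: Any,
    args_for_directory: Callable[[Any, str], Any],
) -> bool:
    """Illustrative model of the proposed per-directory subAccess handling."""
    for directory in directories:
        # Step 1: evaluate the directory using only Ranger policies.
        decision = ranger_check(directory)

        if decision is Decision.DENIED:
            # Step 2: discard the work so far and let the default authorizer
            # decide with the original arguments.
            return default_authorizer(original_args)

        if decision is Decision.NOT_DETERMINED:
            # Step 3: ask the default authorizer about this directory only,
            # using arguments rebuilt for it.
            per_directory_args = args_for_directory(original_args, directory)
            if not default_authorizer(per_directory_args):
                # Step 4: the default authorizer's denial is the result.
                return False

        # Step 5: continue with the next directory.

    # Assumption: no directory was denied, so access is allowed.
    return True
```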

Performance considerations

The new implementation may have some impact on performance. A few cases are as 
follows.
1. There is a Ranger policy that allows the requested permissions recursively 
on some directory in the hierarchy. The number of directories for which access 
evaluation is requested depends on how deep this directory is in the 
hierarchy: the higher the directory, the fewer the evaluations. In the 
existing implementation, short-circuiting of calls for evaluating Ranger 
policies will, in general, happen earlier, and the default authorizer will be 
called upon to handle the authorization.
2. In the worst case, if there is no Ranger policy for any directory in the 
hierarchy, then each directory in the hierarchy will be a target of access 
evaluation by Ranger and by the default authorizer (if the HDFS ACLs for each 
directory allow the requested accesses).
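
To make the worst case concrete, the sketch below counts how many evaluations 
a simplified model of the new design would perform. It is an illustration 
only: `count_evaluations`, its string decision values and the `acl_allows` 
callable are hypothetical, and the model does not cover the recursive-policy 
case in item 1.

```python
from typing import Callable, Dict, Sequence, Tuple


def count_evaluations(
    directories: Sequence[str],
    ranger_decisions: Dict[str, str],
    acl_allows: Callable[[str], bool],
) -> Tuple[int, int]:
    """Return (ranger_evaluations, default_authorizer_calls).

    ranger_decisions maps a directory to "allowed" or "denied"; a directory
    with no entry is treated as "not_determined" (no Ranger policy applies).
    """
    ranger_calls = 0
    default_calls = 0
    for directory in directories:
        ranger_calls += 1
        decision = ranger_decisions.get(directory, "not_determined")
        if decision == "denied":
            # Fallback to the default authorizer with the original arguments.
            default_calls += 1
            break
        if decision == "not_determined":
            # Per-directory call to the default authorizer.
            default_calls += 1
            if not acl_allows(directory):
                break
    return ranger_calls, default_calls


dirs = ["/data", "/data/a", "/data/a/b", "/data/c"]

# Worst case: no Ranger policy anywhere and the HDFS ACLs allow everything,
# so every directory is evaluated by both Ranger and the default authorizer.
print(count_evaluations(dirs, {}, lambda d: True))  # (4, 4)

# Ranger policies allow every directory: the default authorizer is never called.
print(count_evaluations(dirs, {d: "allowed" for d in dirs}, lambda d: True))  # (4, 0)
```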



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
