Abhay Kulkarni created RANGER-4745:
--------------------------------------
Summary: Enhance handling of subAccess authorization in Ranger HDFS plugin
Key: RANGER-4745
URL: https://issues.apache.org/jira/browse/RANGER-4745
Project: Ranger
Issue Type: Improvement
Components: Ranger
Reporter: Abhay Kulkarni
Currently, Ranger authorizes HDFS commands that require access to the
hierarchy of files and directories rooted at the argument passed to the
command, as described below. Some examples of such commands are:
{quote}hdfs dfs -count -q -h -v <directory>; hdfs dfs -R <directory>
{quote}
HDFS Authorization Interface
When these commands are invoked, HDFS Namenode builds a tree of i-nodes
corresponding to <directory>, and passes it to the authorizer with a flag
indicating that subAccess (access to the directory hierarchy rooted at
<directory>) is to be checked.
Ranger implementation
For each directory in the hierarchy rooted at <directory>, Ranger code checks
if the requested permissions (typically read and execute) are allowed using
only Ranger policies. If any directory in the top-down path starting from
<directory> does not allow access, then the authorization steps done until then
are discarded, and the HDFS default authorizer is called upon to check the
access with the same arguments. The default authorizer only checks the HDFS
ACLs (and not any Ranger policies) on each directory in the hierarchy to
determine the access.
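The fallback behaviour described above can be sketched as follows. This is only an illustrative model, not Ranger or HDFS code: the tree representation, walk(), ranger_allows, and make_acl_authorizer are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List, Set

# Hypothetical model: maps a directory path to its child directories.
Tree = Dict[str, List[str]]


def walk(tree: Tree, root: str) -> List[str]:
    """Return the directories rooted at `root` in top-down (pre-order) order."""
    order: List[str] = []
    stack = [root]
    while stack:
        path = stack.pop()
        order.append(path)
        stack.extend(reversed(tree.get(path, [])))
    return order


def make_acl_authorizer(acl_allowed: Set[str]) -> Callable[[Tree, str], bool]:
    """Stand-in for the HDFS default authorizer: it looks only at HDFS ACLs."""
    def check(tree: Tree, root: str) -> bool:
        return all(path in acl_allowed for path in walk(tree, root))
    return check


def check_sub_access_existing(
    tree: Tree,
    root: str,
    ranger_allows: Callable[[str], bool],
    default_authorizer: Callable[[Tree, str], bool],
) -> bool:
    """Existing behaviour: Ranger policies only, else ACLs only."""
    for path in walk(tree, root):
        if not ranger_allows(path):
            # Ranger's partial results are discarded; the default authorizer
            # re-checks the whole hierarchy with the same arguments.
            return default_authorizer(tree, root)
    return True
```

In this model, a hierarchy where Ranger policies allow some directories and HDFS ACLs allow the rest is still denied, because the fallback consults ACLs alone for every directory.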
Design of new Ranger implementation
For each directory in the hierarchy rooted at <directory>, the new Ranger design:
1. Checks if the requested permissions are allowed using only Ranger policies
2. If the access is denied, the authorization steps done until this point are
discarded, and the HDFS default authorizer is called upon to check the access
with the original set of arguments, and the result of the default authorizer is
returned to Namenode.
3. Otherwise, if the access is not determined, a new set of arguments is
constructed for the directory being processed and the HDFS default authorizer
is called to check the access with the modified set of arguments.
4. If the default authorizer does not allow the access, then the result is
returned to Namenode.
5. Otherwise, the processing continues with the next directory.
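The steps above can be sketched as follows. This is a minimal model and not the actual plugin code: Decision, walk(), and make_acl_authorizer are hypothetical names. The sketch also assumes that the "modified set of arguments" restricts the default authorizer to the single directory being processed; the description above does not spell out the exact arguments.

```python
from enum import Enum
from typing import Callable, Dict, List, Set

# Hypothetical model: maps a directory path to its child directories.
Tree = Dict[str, List[str]]


class Decision(Enum):
    ALLOWED = 1
    DENIED = 2
    NOT_DETERMINED = 3


def walk(tree: Tree, root: str) -> List[str]:
    """Return the directories rooted at `root` in top-down (pre-order) order."""
    order: List[str] = []
    stack = [root]
    while stack:
        path = stack.pop()
        order.append(path)
        stack.extend(reversed(tree.get(path, [])))
    return order


def make_acl_authorizer(acl_allowed: Set[str]) -> Callable[[Tree, str, bool], bool]:
    """Stand-in for the HDFS default authorizer: it looks only at HDFS ACLs."""
    def check(tree: Tree, path: str, sub_access: bool) -> bool:
        targets = walk(tree, path) if sub_access else [path]
        return all(target in acl_allowed for target in targets)
    return check


def check_sub_access_new(
    tree: Tree,
    root: str,
    ranger_decision: Callable[[str], Decision],
    default_authorizer: Callable[[Tree, str, bool], bool],
) -> bool:
    for path in walk(tree, root):
        decision = ranger_decision(path)  # step 1: Ranger policies only
        if decision is Decision.DENIED:
            # Step 2: discard progress, defer to the default authorizer
            # with the original arguments, and return its result.
            return default_authorizer(tree, root, True)
        if decision is Decision.NOT_DETERMINED:
            # Step 3 (assumption): check only this directory via HDFS ACLs.
            if not default_authorizer(tree, path, False):
                return False  # step 4: denial is returned to the caller
        # Step 5: continue with the next directory.
    return True
```

In this model, a hierarchy in which Ranger policies cover some directories and HDFS ACLs cover the remaining ones is allowed, since each directory is evaluated on its own.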
Performance considerations
The new implementation may have some impact on the performance. A few cases are
as follows.
1. There is a Ranger policy that allows the requested permissions recursively
on some directory in the hierarchy. Depending on how deep this directory is in
the hierarchy, the number of directories for which the access evaluation is
requested will change. The higher this directory is in the hierarchy, the
fewer the evaluations. In the existing implementation, the short-circuiting of
calls for evaluating Ranger policies will, in general, happen earlier, and the
default authorizer will be called upon to handle the authorization.
2. In the worst case, if there is no Ranger policy for any directory in the
hierarchy, then each directory in the hierarchy will be a target of access
evaluation by Ranger and by the default authorizer (if the HDFS ACLs for each
directory allow the requested accesses).
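To make the worst case concrete, the rough sketch below counts evaluations for a hierarchy with no matching Ranger policy. It is a hypothetical model for illustration only (count_worst_case_evaluations and its parameters are not part of Ranger), and it assumes one Ranger evaluation and one default authorizer call per visited directory.

```python
from typing import List, Set, Tuple


def count_worst_case_evaluations(
    paths: List[str], acl_allowed: Set[str]
) -> Tuple[int, int]:
    """Return (ranger_evaluations, default_authorizer_calls).

    Models the case where Ranger has no policy for any directory, so every
    visited directory is also checked by the default authorizer.
    """
    ranger_evaluations = 0
    default_calls = 0
    for path in paths:
        ranger_evaluations += 1  # Ranger finds no policy: not determined
        default_calls += 1       # per-directory HDFS ACL check
        if path not in acl_allowed:
            break                # denial ends the traversal early
    return ranger_evaluations, default_calls
```

Under these assumptions, a hierarchy of N directories whose ACLs all allow access costs N Ranger evaluations plus N default authorizer calls.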