[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150819#comment-16150819
 ] 

Yongjun Zhang commented on HDFS-12357:
--------------------------------------

Hi [~chris.douglas],

Thanks a lot for your comment.

Some thoughts:

- I assumed that the external attribute provider is not expected to have 
knowledge of the NameNode. Is that not the case?
- I agree that if we call NameNode.getRemoteUser in the external provider, we 
can implement the same logic in the provider. However, that means every 
provider (Sentry, Ranger, etc.) needs to be fixed accordingly; otherwise we 
will get unexpected results. Is this what we want to do?
- The problem here is to decide whether to consult the external provider 
based on the user alone, not on the user/path combination. So it seems 
cleaner to let the NN decide whether to consult the external provider. If we 
let the provider decide and there is a bug in the provider, we will get 
unexpected results.
- Operationally, changing every provider's implementation and updating 
clusters is more expensive.
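
As a rough illustration of the third point, a minimal sketch of an NN-side 
check might look like the following. All names here (ProviderBypassSketch, 
shouldConsultProvider, resolveOwner) are hypothetical and are not taken from 
the attached patch; the comma-separated user list stands in for whatever 
config value the NN would read, and the provider is modeled as a plain 
function from path to owner.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch: the NN decides, per user, whether to consult the provider. */
public class ProviderBypassSketch {
    private final Set<String> bypassUsers;

    /** @param commaSeparatedUsers assumed config value, e.g. "distcp, hdfs" */
    public ProviderBypassSketch(String commaSeparatedUsers) {
        Set<String> users = new HashSet<>();
        if (commaSeparatedUsers != null) {
            for (String u : commaSeparatedUsers.split(",")) {
                String trimmed = u.trim();
                if (!trimmed.isEmpty()) {
                    users.add(trimmed);
                }
            }
        }
        this.bypassUsers = Collections.unmodifiableSet(users);
    }

    /** The decision depends on the user only, never on the path. */
    public boolean shouldConsultProvider(String user) {
        return !bypassUsers.contains(user);
    }

    /**
     * Returns the owner stored in HDFS for bypass users; otherwise asks the
     * external provider (modeled here as a function from path to owner).
     */
    public String resolveOwner(String user, String path, String hdfsOwner,
                               Function<String, String> provider) {
        return shouldConsultProvider(user) ? provider.apply(path) : hdfsOwner;
    }
}
```

With a check like this in the NN, a buggy or unmodified provider is never 
invoked for the special users, so no provider needs to change.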

What do you think about these points?

Thanks.



> Let NameNode to bypass external attribute provider for special user
> -------------------------------------------------------------------
>
>                 Key: HDFS-12357
>                 URL: https://issues.apache.org/jira/browse/HDFS-12357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12357.001.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is, when we do distcp from one cluster to another (or within the 
> same cluster), in addition to copying file data, we copy the metadata from 
> source to target. If external attribute provider is enabled, the metadata may 
> be read from the provider, thus provider data read from source may be saved 
> to target HDFS. 
> We want to avoid saving metadata from external provider to HDFS, so we want 
> to bypass external provider when doing the distcp (or hadoop fs -cp) 
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the 
> other in HDFS-12294. The proposal here is the third one.
> The idea is, we introduce a new config, that specifies a special user (or a 
> list of users), and let NN bypass external provider when the current user is 
> a special user.
> If an application that needs data from the external attribute provider runs 
> as a special user, it won't work. So the constraint on this approach is that 
> the special users should not run applications that need data from the 
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], 
> [~manojg] for the discussions in the other jiras. 
> I'm creating this one to discuss further.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
