[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704360#comment-13704360
 ] 

Tianyou Li commented on HADOOP-9392:
------------------------------------

Hi Brian,

Thanks for reviewing and providing feedback on the design. You have asked some 
good questions, so let me try to add some more context on the design choices and 
why we made them. Hopefully this additional context will add some clarity. 
Please feel free to ask if you still have questions or concerns.

> 1. The new diagram (p. 3) that describes client/TAS/AS/IdP/Hadoop Services 
> interaction shows a client providing credentials to TAS, which then provides 
> the credentials to the IdP. From a security perspective, this seems like a 
> bad idea. It defeats the purpose of having an IdP in the first place. Is this 
> an oversight or by design?
 
From the client's point of view, the TAS should be trusted by the client for 
authentication; whether client credentials can be passed to TAS directly 
depends on the IdP's capabilities and on deployment decisions. If the IdP can 
generate a token and is federated with TAS, then that token can be used to 
authenticate with TAS, which in turn generates an identity token for the Hadoop 
cluster. If the IdP does not have the capability to generate a trusted token 
(e.g. LDAP), then there are several alternate solutions depending on the 
deployment scenario.
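As a rough illustration of the federated case, here is a minimal sketch. All 
names, keys, and token formats below are invented for illustration and are not 
from the design document: the IdP signs a token after authenticating the user, 
and TAS, which shares a trust key with the IdP, verifies that token and mints 
its own identity token for the cluster.

```python
import base64
import hashlib
import hmac
import json

def sign(secret: bytes, claims: dict) -> str:
    """Produce a toy signed token: base64(claims) + '.' + HMAC tag."""
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    tag = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + tag

def verify(secret: bytes, token: str) -> dict:
    """Check the HMAC tag and return the claims, or raise on tampering."""
    body, tag = token.rsplit(".", 1)
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(body))

# Hypothetical federation trust: TAS knows the IdP's key.
IDP_SECRET = b"idp-key-shared-with-tas"
TAS_SECRET = b"tas-cluster-key"

# IdP authenticates the user locally and issues a token.
idp_token = sign(IDP_SECRET, {"sub": "alice", "iss": "idp"})

# TAS accepts the IdP token instead of raw credentials, then mints a
# cluster identity token of its own for use inside Hadoop.
claims = verify(IDP_SECRET, idp_token)
identity_token = sign(TAS_SECRET, {"sub": claims["sub"], "iss": "tas"})
```

In this arrangement the client's raw credentials never reach TAS at all; only 
the IdP-signed token does.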

The first scenario: TAS and the IdP are deployed in the same organization on 
the same network, so TAS can access the IdP directly. In this scenario 
credentials are passed to TAS securely (over SSL), and TAS then passes them on 
to the IdP, e.g. LDAP. The second scenario: TAS and the IdP are deployed on 
different networks and TAS cannot contact the IdP directly. For example, the 
LDAP server resides inside the enterprise while TAS is deployed in the cloud, 
and the client is trying to access the cluster from the enterprise. In this 
scenario, an agent trusted by the client can be deployed to collect client 
credentials, pass them to LDAP (the IdP), and generate a token for the external 
TAS to complete the authentication process. This agent can itself be another 
TAS. The third scenario is similar to the second; the only difference is that 
the client is trying to access the cluster from a public network (for example a 
cloud environment) but needs to use the enterprise LDAP as the IdP. In this 
scenario, an agent (which can be a TAS) needs to be deployed as a gateway on 
the enterprise side to collect credentials.

In any of the above scenarios, when the IdP lacks the capability to generate a 
token as a result of authentication, TAS can act as the agent trusted by the 
client to collect credentials for first-mile authentication. Based on these 
considerations, we drew the flow as shown on page 3.
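The first-mile case with a token-less IdP can be sketched roughly as follows. 
The agent, the LDAP stand-in, and the token format are all hypothetical 
placeholders, not part of the design document: the agent runs next to the IdP, 
validates the credentials against it, and only the successful result (never the 
credentials themselves) reaches the external TAS.

```python
# Stand-in for an LDAP directory: it can answer yes/no to a bind,
# but cannot mint a token itself.
USERS = {"alice": "s3cret"}

def ldap_bind(user: str, password: str) -> bool:
    """LDAP-style IdP: validates credentials, issues no token."""
    return USERS.get(user) == password

def tas_issue(user: str) -> dict:
    """External TAS mints an identity token (placeholder format)."""
    return {"identity_token": f"tok-{user}"}

def trusted_agent(user: str, password: str) -> dict:
    """Agent deployed on the enterprise side, next to the IdP.

    Collects credentials, performs the first-mile authentication
    against the IdP, and only then asks the external TAS for a token;
    the raw credentials never cross to the TAS side.
    """
    if not ldap_bind(user, password):
        raise PermissionError("authentication failed")
    return tas_issue(user)

token = trusted_agent("alice", "s3cret")
```

The agent here could itself be another TAS, as described above; the shape of 
the flow is the same either way.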

> 2. I'm not sure I understand why AS is necessary. It seems to complicate the 
> design by adding an unnecessary authorization check - authorization 
> can/should happen at individual Hadoop services based on token attributes. I 
> think you have mentioned before that authorization (with AS in place) would 
> happen at both places (some level of authz at AS and finer grained authz at 
> services). Can you elaborate on what value that adds over doing authz at 
> services only? And, can you provide an example of what authz checks would 
> happen at each place? (Say I access NameNode. What authz checks are done at 
> AS and what is done at the service?)
 
I agree that authorization can be pushed to the service side, but having 
centralized authorization has some advantages. For example, any authZ policy 
change can be enforced immediately instead of waiting for the policy to sync to 
each service, and it provides a centralized place for auditing client access. 
The centralized authZ acts much like service-level authZ, except that it is 
centralized for the reasons just mentioned. (In the scenario you mentioned: to 
access the HDFS service, you need an access token granted according to the 
defined authZ policy. Once you have the access token you may access the HDFS 
service, but that does not mean you can access any file in HDFS; file- and 
directory-level access control is still done by HDFS itself.)
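A toy sketch of this two-level split may help; the policy tables, token fields, 
and function names are all illustrative, not from the design. The AS grants a 
coarse per-service access token, and the service still enforces its own 
per-resource ACLs:

```python
# Central AS policy: which users may talk to which services at all (coarse).
AS_POLICY = {"alice": {"hdfs"}}

# Service-local policy: per-file ACLs enforced by HDFS itself (fine).
HDFS_ACLS = {"/data/report": {"alice"}, "/secure/keys": {"admin"}}

def as_grant(user: str, service: str) -> dict:
    """Central AS check: may this user access this service at all?"""
    if service not in AS_POLICY.get(user, set()):
        raise PermissionError("no access token for service")
    return {"user": user, "service": service}  # coarse access token

def hdfs_open(access_token: dict, path: str) -> str:
    """Service-side check: does the file ACL allow this user?"""
    if access_token["service"] != "hdfs":
        raise PermissionError("token not valid for this service")
    if access_token["user"] not in HDFS_ACLS.get(path, set()):
        raise PermissionError("file ACL denies access")
    return f"contents of {path}"

tok = as_grant("alice", "hdfs")       # coarse check passes at the AS
data = hdfs_open(tok, "/data/report") # fine-grained check passes at HDFS
```

A policy change in AS_POLICY takes effect on the next grant, without any sync 
to the services, which is the immediacy argument made above.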
 
> 3. I believe this has been mentioned before, but the scope of this document 
> makes it very difficult to move forward with contributing code. It would be 
> very helpful to understand how you envision breaking this down into work 
> items that the community can pick up (I think this is what the DISCUSS thread 
> on common-dev was attempting to do).

This one I am trying to understand a little better. Please help me understand 
what you mean by "… scope of this document makes it very difficult to move 
forward with contributing code." If we were to break down this JIRA into a 
number of sub-tasks based on the document, would that be helpful?

Regards.

                
> Token based authentication and Single Sign On
> ---------------------------------------------
>
>                 Key: HADOOP-9392
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9392
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>             Fix For: 3.0.0
>
>         Attachments: token-based-authn-plus-sso.pdf, 
> token-based-authn-plus-sso-v2.0.pdf
>
>
> This is an umbrella entry for one of project Rhino's topics; for details of 
> project Rhino, please refer to 
> https://github.com/intel-hadoop/project-rhino/. The major goal for this entry 
> as described in project Rhino was 
>  
> “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
> at the RPC layer, via SASL. However this does not provide valuable attributes 
> such as group membership, classification level, organizational identity, or 
> support for user defined attributes. Hadoop components must interrogate 
> external resources for discovering these attributes and at scale this is 
> problematic. There is also no consistent delegation model. HDFS has a simple 
> delegation capability, and only Oozie can take limited advantage of it. We 
> will implement a common token based authentication framework to decouple 
> internal user and service authentication from external mechanisms used to 
> support it (like Kerberos)”
>  
> We’d like to start our work from Hadoop-Common and try to provide common 
> facilities by extending the existing authentication framework to support:
> 1.    Pluggable token provider interface 
> 2.    Pluggable token verification protocol and interface
> 3.    Security mechanism to distribute secrets in cluster nodes
> 4.    Delegation model of user authentication

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira