*sigh*

I'm not sure how I'm failing to communicate this, but I will briefly try 
again…

I never asked for the differences between the two siloed JIRAs, and I am 
trying not to discuss them within this thread, as that is causing thrashing 
we can't really afford.

There have been a number of folks working on security features across 
projects in the community. Many of these have been rather isolated efforts 
that needed to be done and required little community involvement. As we look 
into these larger endeavors, working in silos without a cohesive community is 
a problem. We are trying to build a community around security as a 
cross-cutting concern throughout the Hadoop ecosystem. 

In order to do this, we need to step back and approach the whole effort as a 
community. We identified a couple of ways to start:
1. using common-dev as the security community email list - at least for the 
time being
2. finding a wiki space to articulate a holistic view of the security model and 
drive changes from that common understanding
3. beginning the community work by focusing on this authentication alternative 
to Kerberos

Here is what was agreed upon to be discussed by the community for #3 above:
1. restart with a clean slate - define and meet the goals of the community with 
a single design/vision
2. scope the effort to authentication while keeping in mind not to preclude 
other related aspects of the Hadoop security roadmap - authorization, auditing, 
etc.
3. we are looking for an alternative to Kerberos authentication for users - not 
for services; at least for the first phase, services would continue to 
authenticate using Kerberos, though that needs to be made easier
4. enumerate the high-level components needed for this Kerberos alternative
5. drill down into the details of those components
6. finally, identify the seams of separation that allow for parallel work and 
get the vision delivered

This email was intended to facilitate the discussion of those things.
Comparing and contrasting the two siloed JIRAs sets this community work back 
instead of moving it forward.

We have a need with a very manageable scope and could use your help in defining 
it from the context of your current work.

As Aaron stated, the community discussions around this topic have been 
encouraging, and I too hope that they and the security community continue to 
grow.

Regarding the discussion points that still have not been addressed, I can see 
one possible additional component - though perhaps it is an aspect of the 
authentication providers - that you list below as one of the "differences": 
your thinking around the use of domains for multi-tenancy. I have trouble 
separating user domains from the IdPs deployed in the enterprise or cloud 
environment. Can you elaborate on how these domains relate to those that may 
be found within a particular IdP offering, and how they work together or 
complement each other? From that description we should be able to determine 
whether this is an aspect of the pluggable authentication providers or 
something that should be considered a separate component.

I will be less available for the rest of the day - 4th of July stuff.

On Jul 4, 2013, at 7:21 AM, "Zheng, Kai" <kai.zh...@intel.com> wrote:

> Hi Larry,
> 
> Our design, from its first revision, focuses on and provides comprehensive 
> support for pluggable authentication mechanisms based on a common token, 
> trying to address single sign-on issues across the ecosystem to support 
> access to Hadoop services via RPC, REST, and the web browser SSO flow. The 
> updated design doc adds more text and flows to explain and illustrate 
> these existing items in detail, as requested by some on the JIRA.
> 
> In addition to the identity token we had proposed, we adopted an access token 
> and adapted the approach, not only to make TokenAuth compatible with 
> HSSO, but also for better support of fine-grained access control and 
> seamless integration with our authorization framework, and even third-party 
> authorization services like an OAuth Authorization Server. We regard these as 
> important because Hadoop is evolving into an enterprise and cloud platform 
> that needs a complete authN and authZ solution; without this support we 
> would need future rework to complete the solution.
> 
> Since you asked about the differences between TokenAuth and HSSO, here are 
> some key ones:
> 
> TokenAuth supports TAS federation to allow clients to access multiple 
> clusters without a centralized SSO server while HSSO provides a centralized 
> SSO server for multiple clusters.
> 
> TokenAuth integrates an authorization framework with auditing support in 
> order to provide a complete solution for enterprise data access security. 
> This allows administrators to administer security policies centrally and have 
> the policies enforced consistently across components in the ecosystem, in a 
> pluggable way that supports different authorization models like RBAC, ABAC, 
> and even XACML standards.
> 
> TokenAuth targets support for domain based authN & authZ to allow 
> multi-tenant deployments. Authentication and authorization rules can be 
> configured and enforced per domain, which allows organizations to manage 
> their individual policies separately while sharing a common large pool of 
> resources.
> 
> TokenAuth addresses the proxy/impersonation case with the flow Tianyou 
> mentioned, where a service can proxy a client to access another service in a 
> secure and constrained way.
> 
> Regarding token-based authentication plus SSO and the unified authorization 
> framework, let's continue to use HADOOP-9392 and HADOOP-9466 as umbrella 
> JIRAs for these efforts. HSSO targets support for a centralized SSO 
> server for multiple clusters and, as we have pointed out before, is a nice 
> subset of the work proposed on HADOOP-9392. Let's align these two JIRAs and 
> address the question Kevin raised multiple times in the 9392/9533 JIRAs: "How 
> can HSSO and TAS work together? What is the relationship?" The design update 
> I provided was meant to supply the necessary details so we can nail down that 
> relationship and collaborate on the implementation of these JIRAs.
> 
> As you have also confirmed, this design aligns with related community 
> discussions, so let's continue our collaborative effort to contribute code to 
> these JIRAs.
> 
> Regards,
> Kai
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmc...@hortonworks.com] 
> Sent: Thursday, July 04, 2013 4:10 AM
> To: Zheng, Kai
> Cc: common-dev@hadoop.apache.org
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> 
> Hi Kai -
> 
> I think that I need to clarify something...
> 
> This is not an update for 9533 but a continuation of the discussions that are 
> focused on a fresh look at an SSO for Hadoop.
> We've agreed to leave our previous designs behind, so we aren't really seeing 
> this as an HSSO-layered-on-TAS approach or an HSSO vs. TAS discussion.
> 
> Your latest design revision actually makes it clear that you are now 
> targeting exactly what was described as HSSO - so comparing and contrasting 
> is not going to add any value.
> 
> What we need you to do at this point is to look at the high-level components 
> described on this thread and comment on whether we need additional 
> components, or whether any that are listed don't seem necessary to you, and 
> why.
> In other words, we need to define and agree on the work that has to be done.
> 
> We also need to determine those components that need to be done before 
> anything else can be started.
> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the 
> other components and should probably be defined and POC'd in short order.
> 
> Personally, I think that continuing the separation of 9533 and 9392 will do 
> this effort a disservice. There don't seem to be enough differences between 
> the two to justify separate JIRAs anymore. It may be best to file a new one 
> that reflects a single vision without the extra cruft that has built up in 
> either of the existing ones. We would certainly reference the existing ones 
> within the new one. This approach would align with the spirit of the 
> discussions up to this point.
> 
> I am prepared to start a discussion around the shape of the two Hadoop SSO 
> tokens - identity and access - if that is what others feel the next topic 
> should be.
> If we can identify a jira home for it, we can do it there - otherwise we can 
> create another DISCUSS thread for it.
> 
> thanks,
> 
> --larry
> 
> 
> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <kai.zh...@intel.com> wrote:
> 
>> Hi Larry,
>> 
>> Thanks for the update. Good to see that with this update we are now aligned 
>> on most points.
>> 
>> I have also updated our TokenAuth design in HADOOP-9392. The new revision 
>> incorporates feedback and suggestions in related discussion with the 
>> community, particularly from Microsoft and others attending the Security 
>> design lounge session at the Hadoop summit. Summary of the changes:
>> 1.    Revised the approach to now use two tokens, Identity Token plus Access 
>> Token, particularly considering our authorization framework and 
>> compatibility with HSSO;
>> 2.    Introduced Authorization Server (AS) from our authorization framework 
>> into the flow that issues access tokens for clients with identity tokens to 
>> access services;
>> 3.    Refined proxy access token and the proxy/impersonation flow;
>> 4.    Refined the browser web SSO flow regarding access to Hadoop web 
>> services;
>> 5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop 
>> services via RPC/SASL;
>> 6.    Added client authentication integration flow to illustrate how desktop 
>> logins can be integrated into the authentication process to TAS to exchange 
>> identity token;
>> 7.    Introduced the fine-grained access control flow from the authorization 
>> framework; I have put it in the appendices section for reference;
>> 8.    Added a detailed flow to illustrate Hadoop Simple authentication over 
>> TokenAuth, in the appendices section;
>> 9.    Added secured task launcher in appendices as possible solutions for 
>> Windows platform;
>> 10.    Moved low-level content and less relevant parts from the main body 
>> into the appendices section.
>> 
>> As we all think about how to layer HSSO on TAS in the TokenAuth framework, 
>> please take some time to look at the doc, and then let's discuss the gaps we 
>> might have. I would like to discuss these gaps with a focus on implementation 
>> details so we are all moving towards getting code done. 
>> Let's continue this part of the discussion in HADOOP-9392 to allow for 
>> better tracking on the JIRA itself. For discussions related to the 
>> centralized SSO server, I suggest we continue to use HADOOP-9533 to 
>> consolidate all discussion related to that JIRA. That way we don't need 
>> extra umbrella JIRAs.
>> 
>> I agree we should speed up these discussions and agree on some of the 
>> implementation specifics so both of us can get moving on the code while not 
>> stepping on each other's work.
>> 
>> Look forward to your comments and comments from others in the community. 
>> Thanks.
>> 
>> Regards,
>> Kai
>> 
>> -----Original Message-----
>> From: Larry McCay [mailto:lmc...@hortonworks.com]
>> Sent: Wednesday, July 03, 2013 4:04 AM
>> To: common-dev@hadoop.apache.org
>> Subject: [DISCUSS] Hadoop SSO/Token Server Components
>> 
>> All -
>> 
>> As a follow up to the discussions that were had during Hadoop Summit, I 
>> would like to introduce the discussion topic around the moving parts of a 
>> Hadoop SSO/Token Service.
>> There are a couple of related JIRAs that can be referenced and may or may 
>> not be updated as a result of this discussion thread.
>> 
>> https://issues.apache.org/jira/browse/HADOOP-9533
>> https://issues.apache.org/jira/browse/HADOOP-9392
>> 
>> As the first aspect of the discussion, we should probably state the overall 
>> goals and scoping for this effort:
>> * An alternative authentication mechanism to Kerberos for user 
>> authentication
>> * A broader capability for integration into enterprise identity and 
>> SSO solutions
>> * Possibly the advertisement/negotiation of available authentication 
>> mechanisms
>> * Backward compatibility for the existing use of Kerberos
>> * No (or minimal) changes to existing Hadoop tokens (delegation, job, 
>> block access, etc)
>> * Pluggable authentication mechanisms across: RPC, REST and webui 
>> enforcement points
>> * Continued support for existing authorization policy/ACLs, etc
>> * Keeping more fine-grained authorization policies in mind - like 
>> attribute-based access control
>>      - fine-grained access control is a separate but related effort that we 
>> must not preclude with this effort
>> * Cross cluster SSO
>> 
>> In order to tease out the moving parts here are a couple high level and 
>> simplified descriptions of SSO interaction flow:
>>                              +------+
>>      +------+ credentials 1 | SSO  |
>>      |CLIENT|-------------->|SERVER|
>>      +------+  :tokens      +------+
>>        2 |                    
>>          | access token
>>          V :requested resource
>>      +-------+
>>      |HADOOP |
>>      |SERVICE|
>>      +-------+
>>      
>> The above diagram represents the simplest interaction model for an SSO 
>> service in Hadoop.
>> 1. Client authenticates to the SSO service and acquires an access token:
>>    a. Client presents credentials to an authentication service endpoint 
>>       exposed by the SSO server (AS) and receives a token representing the 
>>       authentication event and verified identity.
>>    b. Client then presents the identity token from 1.a to the token endpoint 
>>       exposed by the SSO server (TGS) to request an access token for a 
>>       particular Hadoop service, and receives an access token.
>> 2. Client presents the Hadoop access token to the Hadoop service for which 
>>    it was granted and requests the desired resource or services:
>>    a. The access token is presented as appropriate for the service endpoint 
>>       protocol being used.
>>    b. The Hadoop service's token validation handler validates the token and 
>>       verifies its integrity and the identity of the issuer.
>> 
>>   +------+
>>   |  IdP |
>>   +------+
>>   1   ^ credentials
>>       | :idp_token
>>       |                      +------+
>>      +------+  idp_token  2 | SSO  |
>>      |CLIENT|-------------->|SERVER|
>>      +------+  :tokens      +------+
>>        3 |                    
>>          | access token
>>          V :requested resource
>>      +-------+
>>      |HADOOP |
>>      |SERVICE|
>>      +-------+
>>      
>> 
>> The above diagram represents a slightly more complicated interaction model 
>> for an SSO service in Hadoop that removes Hadoop from the credential 
>> collection business.
>> 1. Client authenticates to a trusted identity provider within the 
>>    enterprise and acquires an IdP-specific token:
>>    a. Client presents credentials to an enterprise IdP and receives a token 
>>       representing the authenticated identity.
>> 2. Client authenticates to the SSO service and acquires an access token:
>>    a. Client presents the idp_token to an authentication service endpoint 
>>       exposed by the SSO server (AS) and receives a token representing the 
>>       authentication event and verified identity.
>>    b. Client then presents the identity token from 2.a to the token endpoint 
>>       exposed by the SSO server (TGS) to request an access token for a 
>>       particular Hadoop service, and receives an access token.
>> 3. Client presents the Hadoop access token to the Hadoop service for which 
>>    it was granted and requests the desired resource or services:
>>    a. The access token is presented as appropriate for the service endpoint 
>>       protocol being used.
>>    b. The Hadoop service's token validation handler validates the token and 
>>       verifies its integrity and the identity of the issuer.
>>      
>> Considering the above set of goals and high level interaction flow 
>> description, we can start to discuss the component inventory required to 
>> accomplish this vision:
>> 
>> 1. SSO Server Instance: this component must be able to expose endpoints for 
>> both authentication of users by collecting and validating credentials and 
>> federation of identities represented by tokens from trusted IdPs within the 
>> enterprise. The endpoints should be composable so as to allow for 
>> multifactor authentication mechanisms. They will also need to return tokens 
>> that represent the authentication event and verified identity as well as 
>> access tokens for specific Hadoop services.
>> 
>> 2. Authentication Providers: pluggable authentication mechanisms must be 
>> easy to create and configure for use within the SSO server instance. They 
>> should ideally allow the enterprise to plug in preferred off-the-shelf 
>> components as well as provide custom providers. Supporting existing 
>> standards for such authentication providers should be a top priority. There 
>> are a number of standard approaches in use in the Java world: JAAS 
>> LoginModules, servlet filters, JASPIC auth modules, etc. A pluggable 
>> provider architecture that allows the enterprise to leverage existing 
>> investments in these technologies and existing skill sets would be ideal.
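One way to picture the pluggable, composable provider idea is the sketch below. The interface and provider names are hypothetical; as noted above, a real implementation would more likely build on JAAS LoginModules or servlet filters:

```python
from abc import ABC, abstractmethod

class AuthenticationProvider(ABC):
    """One pluggable credential-validation step in the SSO server."""
    @abstractmethod
    def authenticate(self, credentials: dict) -> str:
        """Return a verified principal name, or raise on failure."""

class PasswordProvider(AuthenticationProvider):
    def __init__(self, users: dict):
        self._users = users  # user -> password, in-memory for the demo only
    def authenticate(self, credentials):
        user = credentials["user"]
        if self._users.get(user) != credentials.get("password"):
            raise PermissionError("bad password for " + user)
        return user

class PinProvider(AuthenticationProvider):
    """A second factor, to show that endpoints can be composed."""
    def __init__(self, pins: dict):
        self._pins = pins
    def authenticate(self, credentials):
        user = credentials["user"]
        if self._pins.get(user) != credentials.get("pin"):
            raise PermissionError("bad PIN for " + user)
        return user

def multifactor(providers, credentials):
    """All configured providers must succeed and agree on one principal."""
    principals = {p.authenticate(credentials) for p in providers}
    if len(principals) != 1:
        raise PermissionError("providers disagree on principal")
    return principals.pop()

chain = [PasswordProvider({"alice": "secret"}), PinProvider({"alice": "1234"})]
print(multifactor(chain, {"user": "alice", "password": "secret", "pin": "1234"}))
```

The point of the chain is that the enterprise swaps providers by configuration without touching the SSO server core.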
>> 
>> 3. Token Authority: a token authority component would need to have the 
>> ability to issue, verify and revoke tokens. This authority will need to be 
>> trusted by all enforcement points that need to verify incoming tokens. Using 
>> something like PKI for establishing trust will be required.
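A minimal sketch of the issue/verify/revoke lifecycle described above, assuming an HMAC-signed token and an in-memory revocation list; a real deployment would establish trust via PKI as the description notes, and the class and field names here are illustrative:

```python
import base64, hashlib, hmac, json, uuid

class TokenAuthority:
    def __init__(self, key: bytes):
        self._key = key
        self._revoked = set()  # ids of revoked tokens

    def issue(self, subject: str) -> str:
        # "jti" is a unique token id so individual tokens can be revoked.
        payload = json.dumps({"sub": subject, "jti": uuid.uuid4().hex})
        body = base64.urlsafe_b64encode(payload.encode()).decode()
        sig = hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()
        return body + "." + sig

    def verify(self, token: str) -> dict:
        body, sig = token.rsplit(".", 1)
        good = hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, good):
            raise ValueError("bad signature")
        claims = json.loads(base64.urlsafe_b64decode(body))
        if claims["jti"] in self._revoked:
            raise ValueError("token revoked")
        return claims

    def revoke(self, token: str) -> None:
        self._revoked.add(self.verify(token)["jti"])

ta = TokenAuthority(b"demo-key")
tok = ta.issue("alice")
print(ta.verify(tok)["sub"])  # prints alice
ta.revoke(tok)                # subsequent verify() calls now fail
```

In this shape, every enforcement point only needs the authority's verification material, which is exactly the trust relationship the component description calls out.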
>> 
>> 4. Hadoop SSO Tokens: the exact shape and form of the sso tokens will need 
>> to be considered in order to determine the means by which trust and 
>> integrity are ensured while using them. There may be some abstraction of the 
>> underlying format provided through interface based design but all token 
>> implementations will need to have the same attributes and capabilities in 
>> terms of validation and cryptographic verification.
>> 
>> 5. SSO Protocol: the lowest common denominator protocol for SSO server 
>> interactions across client types would likely be REST. Depending on the REST 
>> client in use, it may require explicitly coding to the token flow described 
>> in the earlier interaction descriptions, or a plugin may be provided for 
>> things like HTTPClient, curl, etc. RPC clients will have this taken care of 
>> for them within the SASL layer and will leverage the REST endpoints as well. 
>> This likely implies trust requirements for the RPC client to be able to 
>> trust the SSO server's identity cert presented over SSL. 
>> 
>> 6. REST Client Agent Plugins: required for encapsulating the interaction 
>> with the SSO server for the client programming models. We may need these for 
>> many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
>> 
>> 7. Server Side Authentication Handlers: the server side of the REST, RPC or 
>> webui connection will need to be able to validate and verify the incoming 
>> Hadoop tokens in order to grant or deny access to requested resources.
>> 
>> 8. Credential/Trust Management: throughout the system - on both client and 
>> server sides - we will need to manage and provide access to PKI and 
>> potentially shared-secret artifacts in order to establish the trust 
>> relationships required to replace the mutual authentication that would 
>> otherwise be provided by using Kerberos everywhere.
>> 
>> So, discussion points:
>> 
>> 1. Are there additional components that would be required for a Hadoop SSO 
>> service?
>> 2. Should any of the above components be considered unnecessary, or are any 
>> poorly described?
>> 3. Should we create a new umbrella JIRA to identify each of these as a 
>> subtask?
>> 4. Should we just continue to use 9533 for the SSO server and add additional 
>> subtasks?
>> 5. What are the natural seams of separation between these components, and 
>> what dependencies between them affect priority?
>> 
>> Obviously, each component that we identify will - more than likely - have a 
>> JIRA of its own, so we are only trying to identify the high-level 
>> descriptions for now.
>> 
>> Can we try and drive this discussion to a close by the end of the week? This 
>> will allow us to start breaking out into component implementation plans.
>> 
>> thanks,
>> 
>> --larry
> 
