On Mon, Sep 24, 2018, at 8:40 PM, John Dennis wrote:
> On 9/24/18 8:00 AM, Colleen Murphy wrote:
> > This is in regard to https://launchpad.net/bugs/1641625 and the proposed 
> > patch https://review.openstack.org/588211 for it. Thanks Vishakha for 
> > getting the ball rolling.
> > 
> > tl;dr: Keystone as an IdP should support sending 
> > non-strings/lists-of-strings as user attribute values, specifically lists 
> > of keystone groups, here's how that might happen.
> > 
> > Problem statement:
> > 
> > When keystone is set up as a service provider with an external non-keystone 
> > identity provider, it is common to configure the mapping rules to accept a 
> > list of group names from the IdP and map them to some property of a local 
> > keystone user, usually also a keystone group name. When keystone acts as 
> > the IdP, it's not currently possible to send a group name as a user 
> > property in the assertion. There are a few problems:
> >      
> >      1. We haven't added any openstack_groups key in the creation of the 
> > SAML assertion 
> > (http://git.openstack.org/cgit/openstack/keystone/tree/keystone/federation/idp.py?h=14.0.0#n164).
> >      2. If we did, this would not be enough. Unlike other IdPs, in keystone 
> > there can be multiple groups with the same name, namespaced by domain. So 
> > it's not enough for the SAML AttributeStatement to contain a 
> > semi-colon-separated list of group names, since a user could theoretically 
> > be a member of two or more groups with the same name.
> >     * Why can't we just send group IDs, which are unique? Because two 
> > different keystones are not going to have independent groups with the same 
> > UUID, so we cannot possibly map an ID of a group from keystone A to the ID 
> > of a different group in keystone B. We could map the ID of the group in in 
> > A to the name of a group in B but then operators need to create groups with 
> > UUIDs as names which is a little awkward for both the operator and the user 
> > who now is a member of groups with nondescriptive names.
> >      3. If we then were able to encode a complex type like a group dict in 
> > a SAML assertion, we'd have to deal with it on the service provider side by 
> > being able to parse such an environment variable from the Apache headers.
> >      4. The current mapping rules engine uses basic python string 
> > formatting to translate remote key-value pairs to local rules. We would 
> > need to change the mapping API to work with values more complex than 
> > strings and lists of strings.
> >      
> > Possible solution:
> > 
> > Vishakha's patch (https://review.openstack.org/588211) starts to solve (1) 
> > but it doesn't go far enough to solve (2-4). What we talked about at the 
> > PTG was:
> >      
> >      2. Encode the group+domain as a string, for example by using the dict 
> > string repr or a string representation of some custom XML and maybe base64 
> > encoding it.
> >          * It's not totally clear whether the AttributeValue class of the 
> > pysaml2 library supports any data types outside of the xmlns:xs namespace 
> > or whether nested XML is an option, so encoding the whole thing as an 
> > xs:string seems like the simplest solution.
> >      3. The SP will have to be aware that openstack_groups is a special key 
> > that needs the encoding reversed.
> >          * I wrote down "MultiDict" in my notes but I don't recall exactly 
> > what format the environment variable would take that would make a MultiDict 
> > make sense here, in any case I think encoding the whole thing as a string 
> > eliminates the need for this.
> >      4. We didn't talk about the mapping API, but here's what I think. If 
> > we were just talking about group names, the mapping API today would work 
> > like this (slight oversimplification for brevity):
> >          
> > Given a list of openstack_groups like ["A", "B", "C"], it would work like 
> > this:
> >              
> > [
> >    {
> >      "local":
> >      [
> >        {
> >          "group":
> >          {
> >            "name": "{0}",
> >            "domain":
> >            {
> >              "name": "federated_domain"
> >            }
> >          }
> >        }
> >      ], "remote":
> >      [
> >        {
> >          "type": "openstack_groups"
> >        }
> >      ]
> >    }
> > ]
> > (paste in case the spacing makes this unreadable: 
> > http://paste.openstack.org/show/730623/ )
> > 
> > But now, we no longer have a list of strings but something more like 
> > [{"name": "A", "domain_name": "Default"} {"name": "B", "domain_name": 
> > "Default", "name": "A", "domain_name": "domainB"}]. Since {0} isn't a 
> > string, this example doesn't really work. Instead, let's assume that in 
> > step (3) we converted the decoded AttributeValue text to an object. Then 
> > the mapping could look more like this:
> >          
> > [
> >    {
> >      "local":
> >      [
> >        {
> >          "group":
> >          {
> >            "name": "{0.name}",
> >            "domain":
> >            {
> >              "name": "{0.domain_name}"
> >            }
> >          }
> >        }
> >      ], "remote":
> >      [
> >        {
> >          "type": "openstack_groups"
> >        }
> >      ]
> >    }
> > ]
> > (paste: http://paste.openstack.org/show/730622/ )
> > 
> > Alternatively, we could forget about the namespacing problem and simply say 
> > we only pass group names in the assertion, and if you have ambiguous group 
> > names you're on your own. We could also try to support both, e.g. have an 
> > openstack_groups mean a list of group names for simpler use cases, and 
> > openstack_groups_unique mean the list of encoded group+domain strings for 
> > advanced use cases.
> > 
> > Finally, whatever we decide for groups we should also apply to 
> > openstack_roles which currently only supports global roles and not 
> > domain-specific roles.
> > 
> > (It's also worth noting, for clarity, that the samlize function does handle 
> > namespaced projects, but this is because it's retrieving the project from 
> > the token and therefore there is only ever one project and one project 
> > domain so there is no ambiguity.)
> > 
> 
> A few thoughts to help focus the discussion:
> 
> * Namespacing is critical, no design should be permitted which allows 
> for ambiguous names. Ambiguous names are a security issue and can be 
> used by an attacker. The SAML designers recognized the importance to 
> disambiguate names. In SAML names are conveyed inside a NameIdentifier 
> element which (optionally) includes "name qualifier" attributes which in 
> SAML lingo is a namespace name.
> 
> * SAML does not define the format of an attribute value. You can use 
> anything you want as long as it can be expressed in valid XML as long as 
> the cooperating parties know how to interpret the XML content. But 
> herein lies the problem. Very few SAML implementations know how to 
> consume an attribute value other than a string. In the real world, 
> despite what the SAML spec says is permitted is the constraint attribute 
> values is a string.
> 
> * I haven't looked at the pysaml implementation but I'd be surprised if 
> it treated attribute values as anything other than a string. In theory 
> it could take any Python object (or JSON) and serialize it into XML but 
> you would still be stuck with the receiver being unable to parse the 
> attribute value (see above point).
> 
> * You can encode complex data in an attribute value while only using a 
> simple string. The only requirement is the relying party knowing how to 
> interpret the string value. Note, this is distinctly different than 
> using non-string attribute values because of who is responsible for 
> parsing the value. If you use a non-string attribute value the SAML 
> library need to know how to parse it, none or very few will know how to 
> process that element. But if it's a string value the SAML library will 
> happily pass that string back up to the application who can then 
> interpret it. The easiest way to embed complex data in a string is with 
> JSON, we do it all the time, all over the place in OpenStack. [1][2]
> 
> So my suggestion would be to give the attribute a meaningful name. 
> Define a JSON schema for the data and then let the upper layers decode 
> the JSON and operate on it. This is no different than any other SAML 
> attribute passed as a string, the receive MUST know how to interpret the 
> string value.
> 
> [1] We already pass complex data in a SAML attribute string value. We 
> permit a comma separated list of group names to appear in the 'groups' 
> mapping rule (although I don't think this feature is documented in our 
> mapping rules documentation). The receiver (our mapping engine) has 
> hard-coded logic to look for a list of names.
> 
> [2] We might want to prepend a format specifier to string containing 
> complex data, e.g. "JSON:{json object}". Our parser could then look for 
> a leading format tag and if if finds one strip it off and pass the rest 
> of the string into the proper parser.
> 
> -- 
> John Dennis
> 

Thanks for this response, John. I think serializing to JSON and prepending a 
format specifier makes sense.

Colleen

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to