Re: [OAUTH-WG] Namespacing "type" in RAR

Dick Hardt Tue, 21 Jul 2020 10:57:15 -0700

In unicode, a glyph can be represented by more than one code point. When
reading the docs and entering a value, the developer will not know which
code point the AS intended.


Are you suggesting that AS documentation would have the bytes rather than
glyphs? Or not use glyphs that have multiple code points? Or that they only
use english?



On Tue, Jul 21, 2020 at 10:34 AM Justin Richer <jric...@mit.edu> wrote:

> Right, and I’m saying that all three of those would be DIFFERENT “type”
> values, because they’re different strings. The fact that when treated as
> URIs they would be equivalent is irrelevant. Just like “foo”, “Foo”, and
> “FOO” would be different “type” values, per the spec. Nothing is stopping
> an AS from treating them as equivalent internally, but that seems a bit
> dangerous to me. I’d love to see a formal breakdown of that, though.
>
> As for the unicode example, if we define things as using byte comparisons,
> then that becomes an issue for proper documentation and configuration — and
> again, probably a good place to have recommendations for picking type value
> strings so as to avoid such problems.
>
> In short, I don’t think we should have any requirements on
> canonicalization for these values.
>
>  — Justin
>
> On Jul 21, 2020, at 1:03 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>
>
> The following are the same URI, but are different strings:
>
> “https://schema.example.org/v1”
> “HTTPS://schema.example.org/v1 <https://schema.example.org/v1>”
> “https://SCHEMA.EXAMPLE.ORG/v1 <https://schema.example.org/v1>”
>
> Before comparing them to each other, they must be canonicalized so that
> they become the same string.
>
> From earlier in this thread, I am NOT suggesting that it must be a URI,
> nor that it is required:
>
> Since the type represents a much more complex object then a JWT claim, a
> client developer's tooling could pull down the JSON Schema (or some such)
> for a type used in their source code, and provide autocompletion and
> validation which would improve productivity and reduce errors. An AS that
> is using a defined type could use the schema for input validation. Neither
> of these would be at run time. JSON Schema allows comments and examples.
>
> What is the harm in non-normative language around a retrievable URI?
>
>
> On Tue, Jul 21, 2020 at 9:58 AM Justin Richer <jric...@mit.edu> wrote:
>
>> String comparison works just fine when the strings happen to be URIs, and
>> you aren’t treating them as URIs:
>>
>> “https://schema.example.org/v1”
>>
>> Is different from
>>
>> “https://schema.example.org/v2”
>>
>> And both are different from
>>
>> “https://schema.example.org:443/v1 <https://schema.example.org/v1>/“
>>
>> All of these are strings, and the strings happen to be URIs but that’s
>> irrelevant to the comparison process. Can you please help me understand why
>> doing a string comparison on these values does not work in exactly the same
>> way it would for “foo”, “bar”, and “baz” values? Why would these need to be
>> canonicalized to be compared? The definition of a JSON string is an ordered
>> set of unicode code points, and this can be compared byte-wise. (Or
>> code-point-wise, whatever’s most correct here.) Can you give me
>> counter-examples as to where string comparison doesn’t work? And can you
>> help me understand how this same worry doesn’t apply to all of the rest of
>> the values in the RAR specification, which are also strings and will need
>> to be compared?
>>
>> I’m still very confused as to the URI retrieval issue here, if there even
>> is one. It sounds like we’re both saying that it could be useful if type
>> values are retrievable when they’re URIs, but that would be something to
>> augment a process and not required for the RAR spec. I’m against requiring
>> the value to be a URI and against requiring the AS to process that URI *as
>> a URI* at runtime. Anything that an AS wants to do with the “type”
>> value, including providing additional tooling and validation, is up to the
>> AS and outside of the spec.
>>
>>  — Justin
>>
>> On Jul 21, 2020, at 12:35 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>
>> This statement:
>>
>> “compare two strings so that they’re exact”
>>
>> does not work for either Unicode or URIs. A string, and a canonicalized
>> Unicode string are not the same thing. Similar for a URI. I have assumed
>> you understand the canonicalization requirement, but it does not sound like
>> you do. Would you like examples?
>>
>>
>> wrt. the AS and URI, *you* keep saying that *I* said the AS would
>> retrieve the URI. I HAVE NOT SAID THAT!
>>
>> I am suggesting that the URI MAY be retrievable, and I gave examples on
>> how that would be useful for tooling for client developers, and for an AS
>> in doing input validation. The URI would NOT be retrieved at run time.
>>
>>
>> On Tue, Jul 21, 2020 at 7:35 AM Justin Richer <jric...@mit.edu> wrote:
>>
>>> If we treat all the strings as just strings, without any special
>>> internal format to be specified or detected, then comparing the strings is
>>> a well-understood and well-documented process. I also think that we
>>> shouldn’t invent anything here, so if there’s a better way to say “compare
>>> two strings so that they’re exact” then that’s what I mean. Sorry if that
>>> was unclear.
>>>
>>> I’m saying the AS should *not* retrieve the URI passed in the “type”
>>> value. You brought that up and then described the process that the AS would
>>> take to do so. I have said from the start that the use of a URI is for name
>>> spacing and not for addressing content to be fetched, so I’m confused why
>>> you think I intend otherwise.
>>>
>>>  — Justin
>>>
>>> On Jul 20, 2020, at 2:59 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>
>>> Canonicalization of URIs and unicode is fairly well specified. I was not
>>> suggesting we invent anything there.
>>>
>>> A byte comparison, as you suggested earlier, will be problematic, as I
>>> have pointed out.
>>>
>>> I'm confused why you are still talking about the AS retrieving a URI.
>>>
>>> ᐧ
>>>
>>> On Mon, Jul 20, 2020 at 4:42 AM Justin Richer <jric...@mit.edu> wrote:
>>>
>>>> Since this is a recommendation for namespace, we could also just say
>>>> collision-resistant like JWT, and any of those examples are fine. But that
>>>> said, I think there’s something particularly compelling about URIs since
>>>> they have somewhat-human-readable portions. But again, I’m saying it should
>>>> be a recommendation to API developers and not a requirement in the spec. In
>>>> the spec, I argue that “type” should be a string, full stop.
>>>>
>>>> If documentation is so confusing that developers are typing in the
>>>> wrong strings, then that’s bad documentation. And likely a bad choice for
>>>> the “type” string on the part of the AS. You’d have the same problem with
>>>> any other value the developer’s supposed to copy over.  :)
>>>>
>>>> I agree that we should call out explicitly how they should be compared,
>>>> and I propose we use one of the handful of existing string-comparison RFC’s
>>>> here instead of defining our own rules.
>>>>
>>>> While the type could be a dereferenceable URI, requiring action on the
>>>> AS is really getting into distributed authorization policies. We tried
>>>> doing that with UMA1’s scope structures and it didn’t work very well in
>>>> practice (in my memory and experience). Someone could profile “type" on top
>>>> of this if they wanted to do so, with support at the AS for that, but I
>>>> don’t see a compelling reason for that to be a requirement as that’s a lot
>>>> of complexity and a lot more error states (the fetch fails, or it doesn’t
>>>> have a policy, or the policy’s in a format the AS doesn’t understand, or
>>>> the AS doesn’t like the policy, etc).
>>>>
>>>> And AS is always free to implement its types in such a fashion, and
>>>> that could make plenty of sense in a smaller ecosystem. And this is yet
>>>> another reason that we define “type” as being a string to be interpreted
>>>> and understood by the AS — so that an AS that wants to work this way can do
>>>> so.
>>>>
>>>>  — Justin
>>>>
>>>> PS: thanks for pointing out the error in the example in XYZ, I’ll fix
>>>> that prior to publication.
>>>>
>>>> On Jul 18, 2020, at 8:58 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>
>>>> Justin: thanks for kindly pointing out which mail list this is.
>>>>
>>>> To clarify, public JWT claims are not just URIs, but any
>>>> collision-resistant namespace:
>>>> "Examples of collision-resistant namespaces include: Domain Names,
>>>> Object Identifiers (OIDs) as defined in the ITU-T X.660 and      X.670
>>>> Recommendation series, and Universally Unique IDentifiers (UUIDs)
>>>> [RFC4122]."
>>>>
>>>> I think letting the "type" be any JSON string and doing a byte-wise
>>>> comparison will be problematic. A client developer will be reading
>>>> documentation to learn what the types are, and typing it in. Given the wide
>>>> set of whitespace characters, and unicode equivalence, different byte
>>>> streams will all look the same, and a byte-wise comparison will fail.
>>>>
>>>> Similarly for URIs. If it is a valid URI, then a byte-wise comparison
>>>> is not sufficient. Canonicalization is required.
>>>>
>>>> These are not showstopper issues, but the specification should call out
>>>> how type strings are compared, and provide caveats to an AS developer.
>>>>
>>>> I have no idea why you would think the AS would retrieve a URL.
>>>>
>>>> Since the type represents a much more complex object then a JWT claim,
>>>> a client developer's tooling could pull down the JSON Schema (or some such)
>>>> for a type used in their source code, and provide autocompletion and
>>>> validation which would improve productivity and reduce errors. An AS that
>>>> is using a defined type could use the schema for input validation. Neither
>>>> of these would be at run time. JSON Schema allows comments and examples.
>>>>
>>>> What is the harm in non-normative language around a retrievable URI?
>>>>
>>>> BTW: the example in
>>>> https://oauth.xyz/draft-richer-transactional-authz#rfc.section.2 has
>>>> not been updated with the "type" field.
>>>>
>>>>
>>>>
>>>> On Sat, Jul 18, 2020 at 8:10 AM Justin Richer <jric...@mit.edu> wrote:
>>>>
>>>>> Hi Dick,
>>>>>
>>>>> This is a discussion about the RAR specification on the OAuth list,
>>>>> and therefore doesn’t have anything to do with alignment with XAuth. In
>>>>> fact, I believe the alignment is the other way around, as doesn’t Xauth
>>>>> normatively reference RAR at this point? Even though, last I saw, it uses 
>>>>> a
>>>>> different top-level structure for conveying things, I believe it does say
>>>>> to use the internal object structures. I am also a co-author on RAR and we
>>>>> had already defined a “type” field in RAR quite some time ago. You did
>>>>> notice that XYZ’s latest draft added this field to keep the two in
>>>>> alignment with each other, which has always been the goal since the 
>>>>> initial
>>>>> proposal of the RAR work, but that’s a time lag and not a display of new
>>>>> intent.
>>>>>
>>>>> In any event, even though I think the decision has bearing in both
>>>>> places, this isn’t about GNAP. Working on RAR’s requirements has brought 
>>>>> up
>>>>> this interesting issue of what should be in the type field for RAR in 
>>>>> OAuth
>>>>> 2.
>>>>>
>>>>> I think that it should be defined as a string, and therefore compared
>>>>> as a byte value in all cases, regardless of what the content of the string
>>>>> is. I don’t think the AS should be expected to fetch a URI for anything. I
>>>>> don’t think the AS should normalize any of the inputs. I think that any
>>>>> JSON-friendly character set should be allowed (including spaces and
>>>>> unicodes), and since RAR already requires the JSON objects to be
>>>>> form-encoded, this shouldn’t cause additional trouble when adding them in
>>>>> to OAuth 2’s request structures.
>>>>>
>>>>> The idea of using a URI would be to get people out of each other’s
>>>>> namespaces. It’s similar to the concept of “public” vs “private” claims in
>>>>> JWT:
>>>>>
>>>>> https://tools.ietf.org/html/rfc7519#section-4.2
>>>>>
>>>>> What I’m proposing is that if you think it’s going to be a
>>>>> general-purpose type name, then we recommend you use a URI as your string.
>>>>> And beyond that, that’s it. It’s up to the AS to figure out what to do 
>>>>> with
>>>>> it, and RAR stays out of it.
>>>>>
>>>>>  — Justin
>>>>>
>>>>> On Jul 17, 2020, at 1:25 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>>
>>>>> Hey Justin, glad to see that you have aligned with the latest XAuth
>>>>> draft on a type property being required.
>>>>>
>>>>> I like the idea that the value of the type property is fully defined
>>>>> by the AS, which could delegate it to a common URI for reuse. This gets
>>>>> GNAP out of specifying access requests, and enables other parties to 
>>>>> define
>>>>> access without any required coordination with IETF or IANA.
>>>>>
>>>>> A complication in mixing plain strings and URIs is the
>>>>> canonicalization. A plain string can be a fixed byte representation, but a
>>>>> URI requires canonicalization for comparison. Mixing the two requires URI
>>>>> detection at the AS before canonicalization, and an AS MUST do
>>>>> canonicalization of URIs.
>>>>>
>>>>> The URI is retrievable, it can provide machine and/or human readable
>>>>> documentation in JSON schema or some such, or any other content type. Once
>>>>> again, the details are out of scope of GNAP, but we can provide examples 
>>>>> to
>>>>> guide implementers.
>>>>>
>>>>> Are you still thinking that bare strings are allowed in GNAP, and are
>>>>> defined by the AS?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 17, 2020 at 8:39 AM Justin Richer <jric...@mit.edu> wrote:
>>>>>
>>>>>> The “type” field in the RAR spec serves an important purpose: it
>>>>>> defines what goes in the rest of the object, including what other fields
>>>>>> are available and what values are allowed for those fields. It provides 
>>>>>> an
>>>>>> API-level definition for requesting access based on multiple dimensions,
>>>>>> and that’s really powerful and flexible. Each type can use any of the
>>>>>> general-purpose fields like “actions” and/or add its own fields as
>>>>>> necessary, and the “type” parameter keeps everything well-defined.
>>>>>>
>>>>>> The question, then, is what defines what’s allowed to go into the
>>>>>> “type” field itself? And what defines how that value maps to the
>>>>>> requirements for the rest of the object? The draft doesn’t say anything
>>>>>> about it at the moment, but we should choose the direction we want to go.
>>>>>> On the surface, there are three main options:
>>>>>>
>>>>>> 1) Require all values to be registered.
>>>>>> 2) Require all values to be collision-resistant (eg, URIs).
>>>>>> 3) Require all values to be defined by the AS (and/or the RS’s that
>>>>>> it protects).
>>>>>>
>>>>>> Are there any other options?
>>>>>>
>>>>>> Here are my thoughts on each approach:
>>>>>>
>>>>>> 1) While it usually makes sense to register things for
>>>>>> interoperability, this is a case where I think that a registry would
>>>>>> actually hurt interoperability and adoption. Like a “scope” value, the 
>>>>>> RAR
>>>>>> “type” is ultimately up to the AS and RS to interpret in their own 
>>>>>> context.
>>>>>> We :want: people to define rich objects for their APIs and enable
>>>>>> fine-grained access for their systems, and if they have to register
>>>>>> something every time they come up with a new API to protect, it’s going 
>>>>>> to
>>>>>> be an unmaintainable mess. I genuinely don’t think this would scale, and
>>>>>> that most developers would just ignore the registry and do what they want
>>>>>> anyway. And since many of these systems are inside domains, it’s 
>>>>>> completely
>>>>>> unenforceable in practice.
>>>>>>
>>>>>> 2) This seems reasonable, but it’s a bit of a nuisance to require
>>>>>> everything to be a URI here. It’s long and ugly, and a lot of APIs are
>>>>>> going to be internal to a given group, deployment, or ecosystem anyway.
>>>>>> This makes sense when you’ve got something reusable across many
>>>>>> deployments, like OIDC, but it’s overhead when what you’re doing is tied 
>>>>>> to
>>>>>> your environment.
>>>>>>
>>>>>> 3) This allows the AS and RS to define the request parameters for
>>>>>> their APIs just like they do today with scopes. Since it’s always the
>>>>>> combination of “this type :AT: this AS/RS”, name spacing is less of an
>>>>>> issue across systems. We haven’t seen huge problems in scope value 
>>>>>> overlap
>>>>>> in the wild, though it does occur from time to time it’s more than
>>>>>> manageable. A client isn’t going to just “speak RAR”, it’s going to be
>>>>>> speaking RAR so that it can access something in particular.
>>>>>>
>>>>>> And all that brings me to my proposal:
>>>>>>
>>>>>> 4) Require all values to be defined by the AS, and encourage
>>>>>> specification developers to use URIs for collision resistance.
>>>>>>
>>>>>> So officially in RAR, the AS would decide what “type” means, and
>>>>>> nobody else. But we can also guide people who are developing
>>>>>> general-purpose interoperable APIs to use URIs for their RAR “type”
>>>>>> definitions. This would keep those interoperable APIs from stepping on 
>>>>>> each
>>>>>> other, and from stepping on any locally-defined special “type” structure.
>>>>>> But at the end of the day, the URI carries no more weight than just any
>>>>>> other string, and the AS decides what it means and how it applies.
>>>>>>
>>>>>> My argument is that this seems to have worked very, very well for
>>>>>> scopes, and the RAR “type” is cut from similar descriptive cloth.
>>>>>>
>>>>>> What does the rest of the group think? How should we manage the RAR
>>>>>> “type” values and what they mean?
>>>>>>
>>>>>>  — Justin
>>>>>> _______________________________________________
>>>>>> OAuth mailing list
>>>>>> OAuth@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/oauth
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

_______________________________________________
OAuth mailing list
OAuth@ietf.org
https://www.ietf.org/mailman/listinfo/oauth

Re: [OAUTH-WG] Namespacing "type" in RAR

Reply via email to