Re: [DISCUSS] Describing REST Server capabilities

Jack Ye Mon, 01 Jul 2024 19:44:59 -0700

Let me try to summarize what my understanding is after the sync, and we can
see if we agree:


Each operation will be versioned separately, starting at v1, which is the
current version in the spec for all operations. When there are significant
changes to the request/response model, or highly backwards-incompatible
changes, the version of that operation will be incremented and it will have
a new URL route at /v{nextVersion}.
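
As a minimal sketch of the routing rule above (Python, illustrative only):
the helper name route_for and the idea of passing the path template as a
string are assumptions, not part of the spec. The loadTable path is the one
used elsewhere in this thread.

```python
def route_for(operation_path: str, version: int) -> str:
    """Build the versioned route for one operation.

    Hypothetical helper: each operation carries its own version number,
    and the version only shows up as the /v{n} prefix of the route.
    """
    if version < 1:
        raise ValueError("operation versions start at 1")
    return f"/v{version}/{operation_path.lstrip('/')}"


# loadTable at v1 and at a (hypothetical) incremented v2
LOAD_TABLE = "namespaces/{ns}/tables/{table}"
v1_route = route_for(LOAD_TABLE, 1)
v2_route = route_for(LOAD_TABLE, 2)
```

Other operations would keep their own version, so bumping loadTable to v2
in this sketch does not move any other route.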

An IRC server must implement the getConfig operation. This operation
expresses a set of logical capabilities. Each capability could cover
certain behaviors in a set of operation versions.

A new capability is defined based on new features proposed in IRC.
Usually, a new backwards-incompatible feature results in a new capability
being defined to control client behavior.

The server could implement a capability partially, and in that case the
server needs to document the behavior.

Each capability is also versioned. Therefore the response of getConfig
should be something like a map where the key is the capability name and the
value is the set of supported versions of that capability. The capability
version is incremented when the same logical capability is updated in a
backwards-incompatible way.

The client looks at each capability and the supported versions, and chooses
the highest capability version it can use by default, unless overridden by
some client-side config.
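
A minimal sketch of that selection logic, assuming the getConfig response
has already been parsed into a plain dict of capability name to supported
versions (Python, illustrative only). The function name choose_versions,
the overrides argument, and the sample capability names and version numbers
are assumptions made for the example, not an agreed wire format.

```python
from typing import Dict, Iterable, Optional, Set


def choose_versions(
    server_caps: Dict[str, Iterable[int]],
    client_caps: Dict[str, Set[int]],
    overrides: Optional[Dict[str, int]] = None,
) -> Dict[str, int]:
    """Pick one version per capability that both sides support.

    Default: the highest common version. A client-side override pins a
    specific version, but only if both sides actually support it.
    """
    overrides = overrides or {}
    chosen: Dict[str, int] = {}
    for name, server_versions in server_caps.items():
        usable = set(server_versions) & client_caps.get(name, set())
        if not usable:
            # Capability unknown to the client, or no common version: skip it.
            continue
        pinned = overrides.get(name)
        if pinned is None:
            chosen[name] = max(usable)
        elif pinned in usable:
            chosen[name] = pinned
        else:
            raise ValueError(f"version {pinned} of {name!r} is not usable")
    return chosen


# Hypothetical parsed getConfig response and client support matrix.
server = {"tables": [1, 2], "views": [1], "multi-table-commit": [1, 2, 3]}
client = {"tables": {1, 2, 3}, "multi-table-commit": {1, 2}}

default_choice = choose_versions(server, client)
pinned_choice = choose_versions(server, client, {"multi-table-commit": 1})
```

In this sketch a capability the client does not know ("views" here) is
simply left out of the result, so the client never calls into it.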

What do we think about this?

-Jack

On Thu, Jun 27, 2024 at 12:53 PM Péter Váry <peter.vary.apa...@gmail.com>
wrote:

> Ignore my previous email - fat thumbed...
>
> Here is the full version:
>
> I think most of us agree that the server should announce its exact
> capabilities, so the clients don't need to guess. The debate is around how
> granular this definition should be.
>
> If we do it on the service level, then the client needs to check, for each
> and every service it is using, whether it has the specific capability.
> While this is more flexible, I think this will become another property file
> on the client side listing all the services and versions, which will be
> hard to understand/work with.
>
> Compatibility affects how often we need to define new versions, so we
> still have to touch on the topic a bit:
> - If we define that the clients should be able to ignore unknown
> properties in the response, then we could decrease the cost of version
> handling on the service side, as the server doesn't need to know the exact
> client version. We push this cost to the client side. It still doesn't
> affect the number of versions, as the new capabilities have to be advertised.
> - We have to be careful with services designed to be forward compatible.
> They could hide behavioral changes while technically keeping
> compatibility. Consider a new key in the map which could cause a commit
> to fail. We should prefer to encode as much of the capabilities in the
> specification as possible, so we will have a new capability.
>
> Based on this, whatever we do, we should expect a high number of changes to
> the API specification. For me this means that this should be human readable
> and easy to understand. I think grouping fits the description better.
>
> Thanks, Peter
>
> On Thu, Jun 27, 2024, 21:12 Péter Váry <peter.vary.apa...@gmail.com>
> wrote:
>
>> I think most of us agree that the server should announce its exact
>> capabilities, so the clients don't need to guess. The debate is around how
>> granular this definition should be.
>>
>> If we do it on service level, then the client needs to examine each and
>> every service it is using whether it has the specific capability. While
>> this is more flexible, I think this will become another property file on
>> the client side listing
>>
>> If we define that the clients should be able to ignore unknown properties
>> in the response, then we could decrease the cost of version handling on the
>> service side, as the server doesn't need to know the exact client version.
>> We push this cost on the client side.
>>
>> On Thu, Jun 27, 2024, 19:24 Robert Stupp <sn...@snazy.de> wrote:
>>
>>> IMO that would be a list of "capability" to "set of versions" tuples.
>>> The reason to have a "set of (integer) version" is that you have to plan
>>> for the future, now.
>>>
>>> I also think we do need "logical" capabilities to express for example
>>> which table/view/etc specs a service supports and to express which request
>>> /response schema (query params + request/response object params) a service
>>> supports. Not doing that will put clients in the spot of guessing whether a
>>> service supports a specific feature of an existing functionality, something
>>> that's been added after the functionality has been introduced.
>>>
>>> Whether we need a capability for each and everything ... I suspect that
>>> depends on the actual feature/functionality/change. Many things can be
>>> designed in a backwards/forwards compatible way and don't deserve a (REST)
>>> spec/capability version bump.
>>> Repeating my point that the service must really fail when it encounters
>>> an unknown attribute: Imagine there's a new table-requirement - that has to
>>> be reflected using the version of a capability. A service should really not
>>> just ignore an unknown capability, otherwise important or data correctness
>>> issues will occur.
>>>
>>> For new versions that are still in development, it's possibly the
>>> easiest to not do anything special, not even announce the new version.
>>> Whether we indicate some "beta version" or reserve "Integer.MAX_VALUE"
>>> doesn't really help anyways - client and server can have a completely
>>> different understanding of "that particular" beta version. I.e. if a client
>>> or service uses "in development" functionality, whether it works or not is
>>> up to them.
>>>
>>> On 27.06.24 17:55, Jack Ye wrote:
>>>
>>> I feel Alex is already tapping into the more complex territory I do not
>>> want to go into, because as he says, a "capability" is logical, and it can
>>> be a set of overlapping endpoints, small features in some endpoints, etc.
>>> We already saw that in the original PR we tried to say "pagination" is a
>>> capability, but it is really just a very small feature of an endpoint, and
>>> it might also evolve on its own to extend to other endpoints in the future,
>>> and maybe one endpoint supports it and one does not...
>>>
>>> My fear is that we are getting into the business of defining things that
>>> are totally unnecessary. It takes our energy to define that and debate the
>>> boundary of each capability, but in the end what does it buy us? As Eduard
>>> says, you still need to have documentation to explain what "partial
>>> capabilities" are supported for a catalog, and people are supposed to read
>>> the documentation to not do the unsupported things.
>>>
>>> In the end, the client-server needs to understand exactly (1) what
>>> endpoints can be invoked, and (2) using what request-response schema, that
>>> is the key to me. If it means returning a response with 20-30ish hard-coded
>>> entries, and the client is configured based on that, that seems totally
>>> reasonable to me.
>>>
>>> -Jack
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jun 27, 2024 at 7:58 AM Alex Dutra
>>> <alex.du...@dremio.com.invalid> wrote:
>>>
>>>> Hi all,
>>>>
>>>> So far we've been thinking of capabilities as equivalent to a set of
>>>> endpoints.
>>>>
>>>> That's a rather technical definition. It also brings one important
>>>> limitation: one endpoint can only be "governed" by one capability.
>>>>
>>>> Granted, most capabilities do require implementing specific endpoints.
>>>> But I wonder if, for the sake of being future-proof, we shouldn't broaden
>>>> the meaning of that term to embrace *logical* or *behavioral* concepts
>>>> as well.
>>>>
>>>> One example that comes to mind: a REST catalog implementor may choose
>>>> to implement the transactions-commit endpoint to fully comply with the
>>>> "tables" capability; but for performance reasons, or simply because it's
>>>> too complex, they could opt for rejecting multi-table commits (iow, if a
>>>> CommitTransactionRequest contains one single CommitTableRequest, that's
>>>> fine, otherwise, the endpoint would return an error). It would be nice to
>>>> express that as a capability: this way the client knows that it is safe to
>>>> call the transactions-commit endpoint, but with one CommitTableRequest at a
>>>> time.
>>>>
>>>> Such a capability would not be defined by a specific endpoint, but
>>>> rather, would influence the behavior exhibited by certain endpoints.
>>>>
>>>> Thanks,
>>>>
>>>> Alex
>>>>
>>>> On Thu, Jun 27, 2024 at 11:34 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>>>> wrote:
>>>>
>>>>> Hi Jack
>>>>>
>>>>> I like Robert's proposal. Back to the topics, I think grouping with
>>>>> tags is more "flexible" (it was what we included in the REST spec
>>>>> proposal as well).
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>> On Wed, Jun 26, 2024 at 6:26 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>>> >
>>>>> > It seems like there are 2 sub-topics here:
>>>>> > 1. should we group operations with tags, or should we do this
>>>>> per-operation/endpoint?
>>>>> > 2. how should we do the capability/versioning for each unit (either
>>>>> per tag or per operation)
>>>>> >
>>>>> > Shall we first conclude on 1?
>>>>> >
>>>>> > For 1, my take is that we will need to do it per operation, for 2
>>>>> reasons:
>>>>> >
>>>>> > (1) There are many REST services that would only implement a very
>>>>> small set of APIs, such as just loadTable and loadView. Some will choose
>>>>> not to implement very specific endpoints, such as renameTable. Tags seem
>>>>> convenient, but they mandate that people implement a specific group of
>>>>> APIs together, which is a heavy burden, especially for small
>>>>> organizations that just want to support very specific goals like reading
>>>>> through IRC.
>>>>> >
>>>>> > (2) Suppose a new tag is added in the future and the server returns
>>>>> that tag, but an older client does not understand it. When a tag contains
>>>>> both features in existing APIs and new APIs, this might cause mistakes in
>>>>> the client's understanding of what is supported and what is not. If we
>>>>> define that tags do not overlap with each other, this is probably not a
>>>>> concern. However, (1) is still a problem from a usability perspective.
>>>>> >
>>>>> > Best,
>>>>> > Jack Ye
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Jun 26, 2024 at 9:02 AM Daniel Weeks <dwe...@apache.org>
>>>>> wrote:
>>>>> >>
>>>>> >> I think Robert's approach is a reasonable compromise here.
>>>>> >>
>>>>> >> If we wanted "per operation/endpoint" versioning, I think I'd
>>>>> prefer Micah's OpenAPI spec based approach because it's more standardized,
>>>>> but I feel it adds a lot of client complexity.
>>>>> >>
>>>>> >> -Dan
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Wed, Jun 26, 2024 at 6:59 AM Robert Stupp <sn...@snazy.de>
>>>>> wrote:
>>>>> >>>
>>>>> >>> (I think, compatibility deserves a separate thread - it's a "huge"
>>>>> topic)
>>>>> >>>
>>>>> >>> Based on experience, we decided on the following with Nessie:
>>>>> >>>
>>>>> >>> Unknown fields/attributes in a structure _DO_ cause
>>>>> (de)serialization failures.
>>>>> >>> "Stable API versions" - endpoint additions and/or added query
>>>>> parameters and/or enhanced structures do _NOT_ require a new API version
>>>>> (as in the endpoint's route/path).
>>>>> >>> "Flexible spec versions" - new and updated "capabilities" however
>>>>> might cause a bump in the "spec version" that the server announces in its
>>>>> `getConfig` result.
>>>>> >>>
>>>>> >>> Adding new routes/paths may require new endpoint implementations
>>>>> on the server side, which can easily lead to a lot of (unnecessary
>>>>> boilerplate) code. Using different routes/paths is justified if the API is
>>>>> changed "fundamentally". We call the "path component" (api/v1/...,
>>>>> api/v2/...) the API version - the server indicates the minimum and maximum
>>>>> supported API version, in case a client wants to "upgrade". I recommend
>>>>> _not_ bumping the API version in the route/path if it's not really necessary.
>>>>> >>>
>>>>> >>> Regarding the requirement to fail on unknown attributes: Unknown
>>>>> attributes may contain important information. A client may send a newer
>>>>> version of a request object with an important new field, but the (older)
>>>>> server discards the new attribute. Think of an attribute that for example
>>>>> defines a "commit condition" that the client expects to be respected. 
>>>>> "New"
>>>>> attributes must be omittable (e.g. don't serialize if null/default) -
>>>>> clients indicate the "usage" of an added attribute using some request
>>>>> attribute (for example: "boolean returnExtendedInformation").
>>>>> >>>
>>>>> >>> The list of capabilities can be indicated with included "spec
>>>>> versions", to tell clients which features/functionalities a server
>>>>> supports. "Production" spec versions could start with 1, and "reserve" 0
>>>>> for experimental/unsupported/poc kinds of implementation. It could look
>>>>> like this:
>>>>> >>>   capabilities: [
>>>>> >>>     "table-spec/2,3",   // but not table-spec v1 here
>>>>> >>>     "view-spec/1",
>>>>> >>>     "table-api/1",
>>>>> >>>     "view-api/1",
>>>>> >>>     "udf-api/1",
>>>>> >>>     "super-feature/2,4,6",   // but not spec versions 0,1,3,5,7+
>>>>> >>>     ...
>>>>> >>>   ]
>>>>> >>> Incrementing a spec version in the list of capabilities doesn't
>>>>> break any client. We could also define a structure to describe each
>>>>> capability:
>>>>> >>>   components:
>>>>> >>>     schemas:
>>>>> >>>       Capability:
>>>>> >>>         type: object
>>>>> >>>         properties:
>>>>> >>>           name:
>>>>> >>>             type: string
>>>>> >>>             description: Name of the capability
>>>>> >>>           versions:
>>>>> >>>             type: array
>>>>> >>>             description: >-
>>>>> >>>               List of supported spec versions of this capability.
>>>>> >>>               0 means experimental (non-production) without any
>>>>> >>>               guarantees about the stability of schema for request
>>>>> >>>               and response parameters.
>>>>> >>>             items:
>>>>> >>>               type: integer
>>>>> >>>               format: int32
>>>>> >>>
>>>>> >>> In Nessie, we ensure backwards and forwards compatibility using a
>>>>> specialized test suite that runs the "in tree" client against older server
>>>>> versions and older client versions against the "in tree" server version.
>>>>> It has worked fine for us for a few years now, and it has helped prevent
>>>>> compatibility issues.
>>>>> >>>
>>>>> >>>
>>>>> >>> On 26.06.24 07:44, Péter Váry wrote:
>>>>> >>>
>>>>> >>> Hi everyone,
>>>>> >>>
>>>>> >>> A few considerations:
>>>>> >>> - I think we should explicitly state which client/service
>>>>> interoperability we are aiming for. I expect that we want to support both
>>>>> old client -> new server, and new client -> old server communications.
>>>>> >>> - I agree with Jack, that we should think about versions in
>>>>> advance - HMS tried to be backwards compatible for everything, and that
>>>>> made it hard to move forward / deprecate things.
>>>>> >>> - Still we should try to keep the backwards incompatible changes
>>>>> minimal. (All clients should be able to ignore unknown incoming fields /
>>>>> New optional input parameter should drive new features / Try to avoid 
>>>>> enums
>>>>> in responses where we expect changes (?))
>>>>> >>> - OTOH, it could be important for clients to know which of the
>>>>> backwards compatible changes are implemented for the given server - so I
>>>>> would decouple the URI from the versioning. Maybe major version change
>>>>> should (could) change the URI, but backwards compatible changes should be
>>>>> served on the same URI, but could be identified by different minor 
>>>>> versions.
>>>>> >>>
>>>>> >>> This is exciting stuff!
>>>>> >>> Thanks for pushing this forward!
>>>>> >>>
>>>>> >>> Peter
>>>>> >>>
>>>>> >>>
>>>>> >>> On Wed, Jun 26, 2024, 00:15 Jack Ye <yezhao...@gmail.com> wrote:
>>>>> >>>>
>>>>> >>>> Hi everyone,
>>>>> >>>>
>>>>> >>>> I do not see a good answer to why we should not simply version
>>>>> each API. When using tags, I have to offer capabilities per tagged
>>>>> group. However, I could for example just offer loadTable and nothing else
>>>>> in a catalog, and that should still be Iceberg REST compliant. And I think
>>>>> we need a versioning story anyway; there is no way around it.
>>>>> >>>>
>>>>> >>>> Here is the workflow in my mind with versioning:
>>>>> >>>>
>>>>> >>>> 1. Going forward, every time the REST catalog spec introduces any
>>>>> new API endpoints or backwards incompatible changes to the existing APIs,
>>>>> the version of the specific API is incremented. So suppose the PlanTable
>>>>> API is added, this API will be at version v1. Suppose UpdateTable is
>>>>> updated with a new update type, that API will be at version v2, but
>>>>> PlanTable will remain at v1.
>>>>> >>>>
>>>>> >>>> 2. a catalog must implement getConfig. This API is the only one
>>>>> that is required.
>>>>> >>>>
>>>>> >>>> 3. in getConfig, in the defaults map (it could be in some new
>>>>> metadata structure, but since we want strong backwards compatibility
>>>>> guarantee, reusing string maps seems to be the best way), server returns
>>>>> key-value pairs of:
>>>>> >>>> - key: operation:<operationName>
>>>>> >>>> - value: version number
>>>>> >>>>
>>>>> >>>> 4. the client assumes that the map is ordered, and resolves API
>>>>> versions sequentially. For example, suppose I have the following map:
>>>>> >>>>
>>>>> >>>> { "operation:planTable": "1", "operation:loadTable": "2" }
>>>>> >>>>
>>>>> >>>> Note that by "supporting", it means to return a response in a
>>>>> predictable way that is compliant with the spec. It can also return 406
>>>>> UnsupportedOperation as a way to support it.
>>>>> >>>>
>>>>> >>>> There is also a special version *, that means any version can
>>>>> work.
>>>>> >>>>
>>>>> >>>> 5. Backwards compatibility: suppose the client is at a higher
>>>>> version than the server, then the client should always be able to
>>>>> understand the server's full list of capabilities.
>>>>> >>>>
>>>>> >>>> 6. Forward compatibility: suppose the client is at a lower
>>>>> version than the server, then the client should parse whatever operation 
>>>>> it
>>>>> understands, and use the highest version it could support to execute the
>>>>> operation. Suppose the client only supports loadTable v1, then it will
>>>>> continue to hit the GET v1/namespaces/{ns}/tables/{table} route, instead 
>>>>> of
>>>>> GET v2/namespaces/{ns}/tables/{table}. The v1 route could continue to
>>>>> support the client, or it could throw 406 to indicate that this route is
>>>>> deprecated and the client needs to upgrade.
>>>>> >>>>
>>>>> >>>> For initial backwards compatibility, I think not returning
>>>>> anything should mean that all APIs that the client understands have
>>>>> version *.
>>>>> >>>>
>>>>> >>>> What do people think of it, compared to the tag approach?
>>>>> >>>>
>>>>> >>>> Best,
>>>>> >>>> Jack Ye
>>>>> >>>>
>>>>> >>>> On Mon, Jun 24, 2024 at 1:42 PM Micah Kornfield <
>>>>> emkornfi...@gmail.com> wrote:
>>>>> >>>>>
>>>>> >>>>> I don't have strong opinions either way here, just thought it
>>>>> was worth raising some concerns over possible evolution here.  Some
>>>>> responses inline, but if capabilities seem to meet the requirement at 
>>>>> hand,
>>>>> then it does potentially seem the simplest mechanism.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>> I think we also want to avoid reliance on server-specific
>>>>> published OpenAPI as they may leak other options/parameters/etc.  This may
>>>>> lead to confusion around what the canonical spec is and make clients
>>>>> incompatible if they're generated off of a non-standard spec document.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> Yeah, I wasn't proposing necessarily using built in
>>>>> functionality but a pre-scrubbed document.  Since there is no reference
>>>>> service implementation for REST it seems like each implementor would need
>>>>> to describe the best way of scrubbing their description.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>>
>>>>> >>>>>> @Micah this sounds to me as if the client would then have to
>>>>> parse a bunch of endpoints to figure out whether it's safe to e.g. call
>>>>> loading a view or dropping a table on the given REST server. Rather than
>>>>> having a dedicated endpoint we're just using the /config endpoint to
>>>>> provide information about what a server supports.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> I was not suggesting multiple endpoints here, simply different
>>>>> contents for /config. I agree that in the short term this does add
>>>>> complexity on the clients. But given that the canonical REST API clients
>>>>> are being developed into the standard library, I'm not sure how much toil
>>>>> this would cause in general. This also does not necessarily need to be
>>>>> called up-front but could be called to verify existence vs a permission
>>>>> issue after an error was received.
>>>>> >>>>>
>>>>> >>>>> What round-trips did you have in mind here?
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>> All good points though, but I'm not aware of a standard way to
>>>>> handle this.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> IIUC, this sounds like a standard service description problem to
>>>>> me; the solution with capabilities appears to be one level of abstraction
>>>>> on top of this. Service discovery seems like it has been reimplemented a
>>>>> few different times depending on the technology [1][2][3].
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>> I think versioning adds another level of complexity, but might
>>>>> be necessary since I expect these will evolve to some extent and may even
>>>>> require hitting versioned urls.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> If there is no concrete proposal on versioning, I agree it
>>>>> probably pays to side step this.  The endpoint transitioning from list of
>>>>> strings to list of objects, would be an obvious sign to clients that they
>>>>> are out of date.  I think serving a service description(s), despite its
>>>>> complexity, is likely the most principled way of versioning items
>>>>> appropriately, but this definitely requires more in depth thought/design.
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> Thanks,
>>>>> >>>>> Micah
>>>>> >>>>>
>>>>> >>>>> [1]
>>>>> https://en.wikipedia.org/wiki/Web_Services_Description_Language
>>>>> >>>>> [2]
>>>>> https://en.wikipedia.org/wiki/Web_Application_Description_Language
>>>>> >>>>> [3] https://developers.google.com/discovery/v1/reference/apis
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> On Mon, Jun 24, 2024 at 12:42 PM Daniel Weeks <dwe...@apache.org>
>>>>> wrote:
>>>>> >>>>>>
>>>>> >>>>>> Hey Micah,
>>>>> >>>>>>
>>>>> >>>>>> I think what we're trying to achieve is to strike a balance
>>>>> between client complexity and the ability to support multiple server-side
>>>>> capabilities. One challenge we've run into is that if a client performs an
>>>>> operation (e.g. listViews) but receives a 403 code, it's not clear whether
>>>>> the client doesn't have access or the server doesn't support the endpoint
>>>>> but isn't sending a 404 for security reasons. This is a simple way for the
>>>>> client to understand what it should expect from the server.
>>>>> >>>>>>
>>>>> >>>>>> >  Another option would be just list all endpoints . . . and
>>>>> let clients take appropriate actions
>>>>> >>>>>> > This could be done by vending the OpenAPI spec the server
>>>>> supports at its own endpoint. I think this avoids the future problem of
>>>>> having to classify new endpoints into a specific capability.
>>>>> >>>>>>
>>>>> >>>>>> You're right that this would be the most complete way to handle
>>>>> this, but it's really complicated and may require additional "handshake"
>>>>> calls even for small interactions with the catalog service.  I think this
>>>>> puts a lot of onus on the client, when what we're describing is a set of
>>>>> endpoints that correspond to a capability.
>>>>> >>>>>>
>>>>> >>>>>> I think we also want to avoid reliance on server-specific
>>>>> published OpenAPI as they may leak other options/parameters/etc.  This may
>>>>> lead to confusion around what the canonical spec is and make clients
>>>>> incompatible if they're generated off of a non-standard spec document.
>>>>> >>>>>>
>>>>> >>>>>> All good points though, but I'm not aware of a standard way to
>>>>> handle this.
>>>>> >>>>>>
>>>>> >>>>>> I think versioning adds another level of complexity, but might
>>>>> be necessary since I expect these will evolve to some extent and may even
>>>>> require hitting versioned urls.
>>>>> >>>>>>
>>>>> >>>>>> -Dan
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>> On Mon, Jun 24, 2024 at 12:03 AM Eduard Tudenhöfner <
>>>>> etudenhoef...@apache.org> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> We had a separate discussion with Dan on the oauth2 flag last
>>>>> week and came to the same conclusion that removing the oauth2 capability 
>>>>> is
>>>>> probably the best for now.
>>>>> >>>>>>> This is mainly because we can't really act on the oauth2
>>>>> capability right now, because the /tokens endpoint is called before we hit
>>>>> the /config endpoint.
>>>>> >>>>>>>
>>>>> >>>>>>> > Another option would be just list all endpoints (and maybe
>>>>> even further which operations are supported) the server actually supports
>>>>> and let clients take appropriate actions (i.e. grouping could happen on 
>>>>> the
>>>>> client side).  This could be done by vending the OpenAPI spec the server
>>>>> supports at its own endpoint. I think this avoids the future problem of
>>>>> having to classify new endpoints into a specific capability.
>>>>> >>>>>>>
>>>>> >>>>>>> @Micah this sounds to me as if the client would then have to
>>>>> parse a bunch of endpoints to figure out whether it's safe to e.g. call
>>>>> loading a view or dropping a table on the given REST server. Rather than
>>>>> having a dedicated endpoint we're just using the /config endpoint to
>>>>> provide information about what a server supports.
>>>>> >>>>>>>
>>>>> >>>>>>> Thanks
>>>>> >>>>>>> Eduard
>>>>> >>>>>>>
>>>>> >>>>>>> On Fri, Jun 21, 2024 at 8:27 PM Ryan Blue
>>>>> <b...@databricks.com.invalid> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> Let's remove the oauth2 tag for now until we figure out how
>>>>> to move forward there. That makes sense to me.
>>>>> >>>>>>>>
>>>>> >>>>>>>> On Fri, Jun 21, 2024 at 9:30 AM Dmitri Bourlatchkov
>>>>> <dmitri.bourlatch...@dremio.com.invalid> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Hi Eduard,
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> The capabilities PR looks good to me overall. I have a
>>>>> concern with the "oauth2" tag name though.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> I also commented [1] in GH but the comment appears to be
>>>>> closed by default :)
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> I believe the term "oauth2" is confusing in this context
>>>>> with respect to RFC 6749 [2] as discussed in depth on another thread [3]
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> The functionality behind the /tokens endpoint is quite
>>>>> specific to the Iceberg REST spec and as the other discussion highlights,
>>>>> there are concerns with respect to OAuth2 interoperability with other
>>>>> OAuth2 servers.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> What do you think about using a different tag name for it,
>>>>> for example "local-tokens" or "auth-tokens"?
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Thanks,
>>>>> >>>>>>>>> Dmitri.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> [1]
>>>>> https://github.com/apache/iceberg/pull/9940/files/15c769a52b85ac4deff5659978c7ffa7802612b0#r1649173934
>>>>> >>>>>>>>> [2] https://www.rfc-editor.org/rfc/rfc6749
>>>>> >>>>>>>>> [3]
>>>>> https://lists.apache.org/thread/twk84xx7v0xy5q5tfd9x5torgr82vv50
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Thu, Jun 20, 2024 at 7:28 AM Eduard Tudenhoefner <
>>>>> etudenhoef...@apache.org> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Hey everyone,
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> I'd like to bring up the discussion around describing REST
>>>>> server capabilities via the /config endpoint.
>>>>> >>>>>>>>>> There is PR #9940 that describes the OpenAPI spec changes.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Mainly we'd like to have a capabilities field in the
>>>>> ConfigResponse that allows servers to indicate to clients which
>>>>> capabilities are being supported.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> So far we have the following capabilities:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> tables
>>>>> >>>>>>>>>> views
>>>>> >>>>>>>>>> remote-signing
>>>>> >>>>>>>>>> vended-credentials
>>>>> >>>>>>>>>> multi-table-commit
>>>>> >>>>>>>>>> register-table
>>>>> >>>>>>>>>> table-metrics
>>>>> >>>>>>>>>> oauth2
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> The general idea behind a capability is that if e.g. a
>>>>> server supports views, then that server must implement all endpoints
>>>>> grouped under that capability.
>>>>> >>>>>>>>>> It's worth noting that the /config endpoint is currently
>>>>> implicit (meaning that every REST server has to implement it).
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> One discussion point that came up during review is how we
>>>>> want to handle capabilities and backwards compatibility and what the
>>>>> default capability would be, since older servers don't know anything about
>>>>> capabilities (in such a case we could assume that the default capabilities
>>>>> would be oauth2 / tables).
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Are there any other capabilities that we'd like to include
>>>>> in the list?
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Eduard
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>>>> --
>>>>> >>>>>>>> Ryan Blue
>>>>> >>>>>>>> Databricks
>>>>> >>>
>>>>> >>> --
>>>>> >>> Robert Stupp
>>>>> >>> @snazy
>>>>>
>>>> --
>>> Robert Stupp
>>> @snazy
>>>
>>>
