Thanks JB,

I do feel like the discussion around OAuth2, SigV4, etc. is a big enough
topic that we wouldn't want to bundle it with other proposed changes.  I
think the discussion around both what is included in the spec and what the
reference implementations will be for each of these protocols will be a
rather large topic.

In general, we find fewer, more focused proposals allow for better
discussion and faster resolution.

Can you split that section out into a separate document and create an issue
for the auth changes?

Thanks,
-Dan

On Thu, May 30, 2024 at 4:55 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Jack,
>
> Here are my comments:
>
> 1. I don't think we should remove the oauth2 endpoint directly like
> this. I would first deprecate the endpoint and plan the removal in the
> spec v2.
> 4. I agree, and it has to be pluggable.
>
> I updated the REST Spec v2 proposal including first steps on v1:
>
> https://docs.google.com/document/d/1JUtFpdEoa6IAKt1EzJi_re0PUbh56XnfUtRe5WAfl0s/edit?usp=sharing
>
> As already shared on the mailing list, I'm working on a PR to have
> interfaces with JAXRS/Swagger annotations to generate OpenAPI
> JSON/YAML with the swagger-gradle-plugin.
>
> Thanks,
> Regards
> JB
>
> On Wed, May 29, 2024 at 8:03 PM Jack Ye <yezhao...@gmail.com> wrote:
> >
> > Just to reiterate my points discussed in the community sync here: the
> more I think about it the more I agree the OAuth endpoint should be removed
> from the REST spec. Even though the endpoint is optional, and even if we do
> not care about the security concerns, it still gives users the impression
> that the endpoint "should" be implemented, or "is the preferred
> authentication mechanism". And as we have found out, the server capability
> proposal does not cover this case since this is the first endpoint to hit
> before the GetConfig endpoint.
> >
> > As Ryan said, if we want to do that we need an alternative plan. I don't
> have anything concrete, but here is my line of thought:
> >
> > 1. remove OAuth2 endpoint from the "REST OpenAPI spec"
> >
> > 2. create a client-side interface (in each language) that different
> authentication mechanisms can be plugged in to talk to the REST catalog
> >
> > 3. refactor and make OAuth2 an implementation of that interface. I can
> also help with doing the same for AWS Sigv4, and the community can further
> support some additional ones like Kerberos, SAML, Google SSO, etc. based on
> the individual use cases.
> >
> > 4. turn 2 + 3 into a "REST catalog authentication spec" that documents
> all the supported authentication mechanisms and their defaults. For
> OAuth2, the default is to have the auth server at the same endpoint as the
> resource server for backwards compatibility, but that is a configurable
> property, and we could recommend not to do that based on security concerns.
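
Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration of the plug-in idea, not an existing Iceberg API: the names AuthProvider, NoAuth, BearerTokenAuth and load_auth_provider are all hypothetical, and a real OAuth2 or SigV4 implementation would do much more than set a static header.

```python
from abc import ABC, abstractmethod
from typing import Dict


class AuthProvider(ABC):
    """Hypothetical client-side plug-in point; not part of any Iceberg API."""

    @abstractmethod
    def authenticate(self, method: str, url: str, headers: Dict[str, str]) -> Dict[str, str]:
        """Return a copy of `headers` with authentication material added."""


class NoAuth(AuthProvider):
    """For catalogs that run without authentication (e.g. local test setups)."""

    def authenticate(self, method, url, headers):
        return dict(headers)


class BearerTokenAuth(AuthProvider):
    """Attaches an access token that was obtained elsewhere."""

    def __init__(self, token: str):
        self._token = token

    def authenticate(self, method, url, headers):
        signed = dict(headers)  # never mutate the caller's headers
        signed["Authorization"] = f"Bearer {self._token}"
        return signed


# Hypothetical registry: mechanism name -> implementation class.
_REGISTRY = {"none": NoAuth, "bearer": BearerTokenAuth}


def load_auth_provider(name: str, **options) -> AuthProvider:
    """Look up an auth mechanism by its configured name and instantiate it."""
    try:
        factory = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown auth mechanism: {name!r}") from None
    return factory(**options)
```

With such an interface the REST client would only call authenticate() before each request, and mechanisms like SigV4, Kerberos or SAML would become additional registry entries rather than spec endpoints.
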
> >
> > Best,
> > Jack Ye
> >
> > On Wed, May 29, 2024 at 10:28 AM Steven Wu <stevenz...@gmail.com> wrote:
> >>
> >> Wondering if the auth endpoints can be separated out to a separate
> OpenAPI spec file. Then we still have some reference for interactions with
> auth server and make it clear it is not required as part of the REST
> catalog server. In most enterprise environments, auth server is likely a
> separate server.
> >>
> >> On Tue, May 28, 2024 at 1:25 PM Alex Dutra
> <alex.du...@dremio.com.invalid> wrote:
> >>>
> >>> Hi,
> >>>
> >>>>
> >>>> On point 4, isn't that possible today? Can't that be achieved with
> the current token exchange approach, and the internal implementation of the
> endpoint?
> >>>
> >>>
> >>> Unfortunately, no. Token exchange is not widely adopted yet: for
> example, Keycloak has only partial support for it, and Authelia, or
> Authentik, have no support for it at all.
> >>>
> >>> This, and a few other technical issues with the current internals of
> the REST client, make it nearly impossible to achieve a good integration
> of Iceberg REST with the majority of popular OSS authorization servers.
> >>>
> >>> I am planning to start another email thread to discuss these
> practicalities, but let's first reach consensus on the broader security
> issues voiced here, before we tackle the details.
> >>>
> >>> Thanks,
> >>>
> >>> Alex Dutra
> >>>
> >>> On Tue, May 28, 2024 at 8:41 PM Amogh Jahagirdar <am...@tabular.io>
> wrote:
> >>>>
> >>>> I disagree with removing "/v1/oauth/tokens" and I think I also
> disagree with the premise that implementing that endpoint is required, but
> I can understand how that's not clear in the spec. I think we can address
> the required vs non-required discussion with the capabilities PR.
> >>>>
> >>>> It seems like another part of what's driving this discussion is some
> concern around how we ensure that REST catalog implementations which do
> implement this endpoint are secure (for
> example to avoid the MITM example that was brought up). This is ultimately
> a runtime detail. To me it seems like if we make it clear that such an
> endpoint should be implemented respecting OAuth2 standards, and we know
> that OAuth2 compliance requires avoiding that MITM situation, then runtime
> implementations should just follow the spec there.
> >>>>
> >>>> >3. Enable flexibility for Iceberg REST servers to opt for other
> >>>> authorization mechanisms than OAuth 2.0.
> >>>> >4. Enable REST servers to opt for integrating with any standard
> OAuth2 /
> >>>> OIDC provider (e.g. Okta, Keycloak, Authelia).
> >>>>
> >>>> I agree with both of these points; again I don't think the intention
> is that OAuth2 is the only way, but I think the capabilities PR will make that
> even more clear.
> >>>> On point 4, isn't that possible today? Can't that be achieved with
> the current token exchange approach, and the internal implementation of the
> endpoint? Sorry if I missed that explanation.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Amogh Jahagirdar
> >>>>
> >>>> On Tue, May 28, 2024 at 11:13 AM Yufei Gu <flyrain...@gmail.com>
> wrote:
> >>>>>
> >>>>> Not an expert on authentication, but reading from the context, I
> agree that it’s not a good practice to use a resource server as a token
> server. The resource server would need to securely handle and store
> credentials or tokens, increasing the risk of credential theft or leakage.
> Making the token endpoint optional will mitigate the issue a bit. But if we
> want to disable it completely, it's better to do it now to prevent any
> issues and migration costs in the future. Can we have a consensus on it?
> >>>>>
> >>>>>
> >>>>> I would prefer to deprecate it to prevent any intentional and
> unintentional misuse. We will also need to change the clients since it
> connects to the endpoint by default.
> >>>>>
> >>>>>
> >>>>> Yufei
> >>>>>
> >>>>>
> >>>>> On Tue, May 28, 2024 at 8:27 AM Jack Ye <yezhao...@gmail.com> wrote:
> >>>>>>
> >>>>>> Sounds like we should try to finalize a consensus around
> https://github.com/apache/iceberg/pull/9940, so that we make it very
> clear what APIs/features are optional.
> >>>>>>
> >>>>>> -Jack
> >>>>>>
> >>>>>> On Tue, May 28, 2024 at 7:25 AM Fokko Driesprong <fo...@apache.org>
> wrote:
> >>>>>>>
> >>>>>>> Hey Robert,
> >>>>>>>
> >>>>>>> Sorry for the late reply as I was out last week. I'm not an OAuth
> guru either, but some context from my end.
> >>>>>>>
> >>>>>>>> * Credentials (for example username/password) must _never_ be
> sent to
> >>>>>>>> the resource server, only to the authorization server.
> >>>>>>>
> >>>>>>>
> >>>>>>> In an earlier discussion, it was agreed that the resource server
> can also function as the authorization server. But the roles can also be
> separate.
> >>>>>>>
> >>>>>>>> 1.2. As long as OAuth2 is the only mechanism supported by the
> Iceberg
> >>>>>>>> client, make the existing client parameter “oauth2-server-uri”
> >>>>>>>> mandatory. The Iceberg REST catalog must fail to initialize if the
> >>>>>>>> “oauth2-server-uri” parameter is not defined.
> >>>>>>>
> >>>>>>>
> >>>>>>> It can also be that there is no authentication in the case of an
> internal REST catalog. For example, the iceberg-rest-image that we use for
> integration tests in PyIceberg.
> >>>>>>>
> >>>>>>>> We think that Apache Iceberg REST Catalog spec should not mandate
> that a
> >>>>>>>> catalog implementation responds to requests to produce Auth Tokens
> >>>>>>>> (since the REST spec v1 defines a /v1/oauth/tokens endpoint, current
> >>>>>>>> implementations have to take deliberate actions when responding
> to those
> >>>>>>>> requests, whether with successful token responses or with “access
> >>>>>>>> denied” or “unsupported” responses).
> >>>>>>>
> >>>>>>> The `/v1/oauth/tokens` endpoint is optional.
> >>>>>>>
> >>>>>>>> * Credentials (for example username/password) must _never_ be
> sent to
> >>>>>>>> the resource server, only to the authorization server.
> >>>>>>>
> >>>>>>>
> >>>>>>> I fully agree!
> >>>>>>>
> >>>>>>>> Even if an Iceberg REST server does not implement the
> ‘/v1/oauth/tokens’
> >>>>>>>> endpoint, it can still receive requests to ‘/v1/oauth/tokens’
> containing
> >>>>>>>> clear text credentials, if clients are misconfigured (humans do
> make
> >>>>>>>> mistakes) - it’s a non-zero risk - bad actors can
> implement/intercept
> >>>>>>>> that  ‘/v1/oauth/tokens’ endpoint and just wait for misconfigured
> >>>>>>>> clients to send credentials.
> >>>>>>>
> >>>>>>>
> >>>>>>> I think the wording is chosen badly. It should not send any
> credentials, but the code (as in this example by GCS).
> >>>>>>>
> >>>>>>>> I think Jack makes a good point with AWS SigV4 Authentication. I
> suppose, in REST Catalog implementations that support that auth method, the
> /v1/oauth/tokens Catalog REST endpoint is redundant.
> >>>>>>>
> >>>>>>>
> >>>>>>> There are other cloud providers next to AWS.
> >>>>>>>
> >>>>>>> Kind regards,
> >>>>>>> Fokko
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, May 23, 2024 at 3:49 PM Dmitri Bourlatchkov
> <dmitri.bourlatch...@dremio.com.invalid>:
> >>>>>>>>
> >>>>>>>> I think Jack makes a good point with AWS SigV4 Authentication. I
> suppose, in REST Catalog implementations that support that auth method, the
> /v1/oauth/tokens Catalog REST endpoint is redundant.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Dmitri.
> >>>>>>>>
> >>>>>>>> On Thu, May 23, 2024 at 9:20 AM Jack Ye <yezhao...@gmail.com>
> wrote:
> >>>>>>>>>
> >>>>>>>>> I do not know enough details about OAuth to make comments about
> this issue, but just regarding the statement "OAuth2 is the only mechanism
> supported by the Iceberg client", AWS Sigv4 auth is also supported, at
> least in the Java client implementation. It would be nice if we formalize
> that in the spec, at least define it as a generic authorization header.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jack Ye
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, May 23, 2024 at 2:51 AM Robert Stupp <sn...@snazy.de>
> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> Iceberg REST implementations, either accessible on the public
> internet
> >>>>>>>>>> or inside an organization, are usually being secured using
> appropriate
> >>>>>>>>>> authorization mechanisms. The Nessie team is looking at
> implementing the
> >>>>>>>>>> Iceberg REST specification and have some questions around the
> security
> >>>>>>>>>> endpoint(s) defined in the spec.
> >>>>>>>>>>
> >>>>>>>>>> TL;DR we have questions (potentially concerns) about having the
> >>>>>>>>>> ‘/v1/oauth/tokens’ endpoint, for the reasons explained below.
> We think
> >>>>>>>>>> that ‘/v1/oauth/tokens’ poses potential security and OAuth2
> compliance
> >>>>>>>>>> issues, and imposes how authorization should be implemented.
> >>>>>>>>>> * As an open table format, it would be good for Iceberg to
> focus on the
> >>>>>>>>>> table format / catalog and not how authorization is
> implemented. The
> >>>>>>>>>> existence of an OAuth endpoint pushes implementations to adopt
> >>>>>>>>>> authorization using only OAuth, whereas the implementers might
> choose
> >>>>>>>>>> several other ways to implement authorization (e.g. SAML). In
> our
> >>>>>>>>>> opinion the spec should leave it open to the implementation to
> decide
> >>>>>>>>>> how authorization will be implemented.
> >>>>>>>>>> * The existence of that endpoint also pushes operators of
> Iceberg REST
> >>>>>>>>>> endpoints into the authorization service business.
> >>>>>>>>>> * Clients might expose their clear-text credentials to the wrong
> >>>>>>>>>> service, if the (correct) OAuth endpoint is not configured
> (humans do
> >>>>>>>>>> make mistakes).
> >>>>>>>>>> * (Naive) Iceberg REST servers may proxy requests received for
> >>>>>>>>>> ‘/v1/oauth/tokens’ - and effectively become a
> “man-in-the-middle”, which
> >>>>>>>>>> is not fully compliant with the OAuth 2.0 specification.
> >>>>>>>>>>
> >>>>>>>>>> Our goals with this discussion are:
> >>>>>>>>>> 1. Secure the Iceberg REST specification by preventing
> accidental
> >>>>>>>>>> misuse/misimplementation.
> >>>>>>>>>> 2. Prevent Iceberg REST from dictating the
> “authorization
> >>>>>>>>>> server specifics”.
> >>>>>>>>>> 3. Enable flexibility for Iceberg REST servers to opt for other
> >>>>>>>>>> authorization mechanisms than OAuth 2.0.
> >>>>>>>>>> 4. Enable REST servers to opt for integrating with any standard
> OAuth2 /
> >>>>>>>>>> OIDC provider (e.g. Okta, Keycloak, Authelia).
> >>>>>>>>>>
> >>>>>>>>>> OAuth 2.0 [1] is one of the common standards accepted in the
> industry.
> >>>>>>>>>> It defines a secure mechanism to access resources (here:
> Iceberg REST
> >>>>>>>>>> endpoints). The most important aspect for OAuth 2.0 resources
> is that
> >>>>>>>>>> (Iceberg REST) servers do not (have to) support password
> authentication,
> >>>>>>>>>> especially considering the security weaknesses inherent in
> passwords.
> >>>>>>>>>> Compromised passwords result in compromised data protected by
> that password.
> >>>>>>>>>>
> >>>>>>>>>> Therefore OAuth 2.0 defines a set of strict rules. Some of
> these are:
> >>>>>>>>>> * Credentials (for example username/password) must _never_ be
> sent to
> >>>>>>>>>> the resource server, only to the authorization server.
> >>>>>>>>>> * OAuth 2.0 refresh tokens must _never_ be sent to the resource
> server,
> >>>>>>>>>> only to the authorization server. (“Unlike access tokens,
> refresh tokens
> >>>>>>>>>> are intended for use only with authorization servers and are
> never sent
> >>>>>>>>>> to resource servers.”, quoted from section 1.5 of the OAuth RFC
> 6749.)
> >>>>>>>>>> * While the OAuth RFC states "The authorization server may be
> the same
> >>>>>>>>>> server as the resource server or a separate entity", this
> should not be
> >>>>>>>>>> mandated, i.e., the spec should be open to supporting
> implementations that
> >>>>>>>>>> have the authorization server and resource server co-located as
> well as
> >>>>>>>>>> separate.
> >>>>>>>>>>
> >>>>>>>>>> The Iceberg PR 4771 [2] added the OpenAPI path
> ‘/v1/oauth/tokens’,
> >>>>>>>>>> intentionally documented as “To exchange client credentials (client
> ID and
> >>>>>>>>>> secret) for an access token. This uses the client credentials
> flow.”
> >>>>>>>>>> [3]. Technically: client ID and secret are submitted using an
> HTTP POST
> >>>>>>>>>> request to that Iceberg REST endpoint.
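
For readers less familiar with the client credentials flow, here is a minimal sketch of what such a token request looks like on the wire, following RFC 6749 sections 2.3.1 and 4.4.2. The function name and the returned dict shape are made up for illustration, and nothing is sent over the network; the sketch is not taken from any Iceberg client.

```python
import base64
from urllib.parse import quote_plus, urlencode


def build_client_credentials_request(token_url, client_id, client_secret, scope=None):
    """Describe the HTTP request for an OAuth 2.0 client credentials grant.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope

    # RFC 6749 section 2.3.1: form-encode id and secret, then use HTTP Basic.
    userpass = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    basic = base64.b64encode(userpass.encode("utf-8")).decode("ascii")

    return {
        "method": "POST",
        "url": token_url,
        "headers": {
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {basic}",
        },
        "body": urlencode(form),
    }
```

Whatever server token_url points at receives the client secret, so if that URL is the catalog's own '/v1/oauth/tokens' path, the resource server sees the credentials.
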
> >>>>>>>>>>
> >>>>>>>>>> Having ‘/v1/oauth/tokens’ in the Iceberg REST specification can
> easily
> >>>>>>>>>> be seen as a hard requirement. In order to implement this in
> compliance
> >>>>>>>>>> with the OAuth 2.0 spec, that ‘/v1/oauth/tokens’ MUST be the
> >>>>>>>>>> authorization server. If users do not (want to) implement an
> >>>>>>>>>> authorization server, the only way to implement this
> ‘/v1/oauth/tokens’
> >>>>>>>>>> endpoint would be to proxy ‘/v1/oauth/tokens’ to the actual
> >>>>>>>>>> authorization server, which means, that this proxy technically
> becomes a
> >>>>>>>>>> “man in the middle” - knowing both all credentials and all
> involved tokens.
> >>>>>>>>>>
> >>>>>>>>>> Even if an Iceberg REST server does not implement the
> ‘/v1/oauth/tokens’
> >>>>>>>>>> endpoint, it can still receive requests to ‘/v1/oauth/tokens’
> containing
> >>>>>>>>>> clear text credentials, if clients are misconfigured (humans do
> make
> >>>>>>>>>> mistakes) - it’s a non-zero risk - bad actors can
> implement/intercept
> >>>>>>>>>> that  ‘/v1/oauth/tokens’ endpoint and just wait for
> misconfigured
> >>>>>>>>>> clients to send credentials.
> >>>>>>>>>>
> >>>>>>>>>> Further usages of the REST Catalog API path ‘/v1/oauth/tokens’
> are “To
> >>>>>>>>>> exchange a client token and an identity token for a more
> specific access
> >>>>>>>>>> token. This uses the token exchange flow.” and “To exchange an
> access
> >>>>>>>>>> token for one with the same claims and a refreshed expiration
> period.
> >>>>>>>>>> This uses the token exchange flow.” Both usages should and can
> be
> >>>>>>>>>> implemented differently.
> >>>>>>>>>>
> >>>>>>>>>> Apache Iceberg, as a table format project, should recommend
> protecting
> >>>>>>>>>> sensitive information. But Iceberg should not mandate _how_ that
> >>>>>>>>>> protection is implemented - but the Iceberg REST specification
> does
> >>>>>>>>>> effectively mandate OAuth 2.0, because other Iceberg REST
> endpoints do
> >>>>>>>>>> refer/require OAuth 2.0 specifics. Users that want to use other
> >>>>>>>>>> mechanisms, because they are forced to do so by their
> organization,
> >>>>>>>>>> would be locked out of Iceberg REST.
> >>>>>>>>>>
> >>>>>>>>>> Apache Iceberg should not mandate OAuth 2.0 as the only option
> - for the
> >>>>>>>>>> sake of openness for the project and flexibility for the server
> >>>>>>>>>> implementations.
> >>>>>>>>>>
> >>>>>>>>>> We think that Apache Iceberg REST Catalog spec should not
> mandate that a
> >>>>>>>>>> catalog implementation responds to requests to produce Auth
> Tokens
> >>>>>>>>>> (since the REST spec v1 defines a /v1/oauth/tokens endpoint, current
> >>>>>>>>>> implementations have to take deliberate actions when responding
> to those
> >>>>>>>>>> requests, whether with successful token responses or with
> “access
> >>>>>>>>>> denied” or “unsupported” responses).
> >>>>>>>>>>
> >>>>>>>>>> We propose the following actions:
> >>>>>>>>>> 1. Immediate mitigation:
> >>>>>>>>>> 1.1. Remove the ‘/v1/oauth/tokens’ endpoint entirely from
> Iceberg’s
> >>>>>>>>>> OpenAPI spec without replacement.
> >>>>>>>>>> 1.2. As long as OAuth2 is the only mechanism supported by the
> Iceberg
> >>>>>>>>>> client, make the existing client parameter “oauth2-server-uri”
> >>>>>>>>>> mandatory. The Iceberg REST catalog must fail to initialize if
> the
> >>>>>>>>>> “oauth2-server-uri” parameter is not defined.
> >>>>>>>>>> 1.3. Remove all fallbacks to the ‘/v1/oauth/tokens’ endpoint.
> >>>>>>>>>> 1.4. Forbid or discourage the communication of tokens from any
> Iceberg
> >>>>>>>>>> REST Catalog endpoint, both via the "token" property or with
> any of the
> >>>>>>>>>> "urn:ietf:params:oauth:token-type:*" properties.
> >>>>>>>>>> 2. As a follow up: We’d propose a couple of implementation
> fixes and
> >>>>>>>>>> changes and test improvements.
> >>>>>>>>>> 3. As a follow up: Define a discovery mechanism for both the
> Iceberg
> >>>>>>>>>> REST base URI and OAuth 2.0 endpoints/discovery, which allows
> users to
> >>>>>>>>>> use a single URI to securely access Iceberg REST endpoints.
> >>>>>>>>>> 4. As a follow up: Not new, but we also want to improve the
> Iceberg REST
> >>>>>>>>>> specification via the “new” REST proposal.
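
Action 1.2 in the list above (fail initialization when "oauth2-server-uri" is missing, with no fallback as per 1.3) could look roughly like this on the client side. Only the property name comes from this thread; the function name and error text are hypothetical, and the sketch does not cover the no-authentication case raised elsewhere in the discussion.

```python
def resolve_token_endpoint(properties):
    """Return the configured OAuth2 token endpoint or refuse to initialize.

    Hypothetical helper: there is deliberately no fallback to the
    catalog's own '/v1/oauth/tokens' path.
    """
    uri = (properties.get("oauth2-server-uri") or "").strip()
    if not uri:
        raise ValueError(
            "oauth2-server-uri must be set; refusing to fall back to the "
            "catalog's /v1/oauth/tokens endpoint"
        )
    return uri
```

Failing fast here means a misconfigured client never sends credentials anywhere, instead of silently sending them to the catalog URI.
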
> >>>>>>>>>>
> >>>>>>>>>> We do not think that adding recommendations to
> inline-documentation is
> >>>>>>>>>> enough to fully mitigate the above concerns.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> References:
> >>>>>>>>>>
> >>>>>>>>>> [1] RFC 6749 - The OAuth 2.0 Authorization Framework,
> >>>>>>>>>> https://datatracker.ietf.org/doc/html/rfc6749
> >>>>>>>>>> [2] Iceberg pull request 4771 - Core: Add OAuth2 to REST
> catalog spec -
> >>>>>>>>>> https://github.com/apache/iceberg/pull/4771
> >>>>>>>>>> [3] Iceberg pull request 4843 - Spec: Add more context about
> OAuth2 to
> >>>>>>>>>> the REST spec - https://github.com/apache/iceberg/pull/4843
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Robert Stupp
> >>>>>>>>>> @snazy
> >>>>>>>>>>
>
