Seems like this thread did not get much attention?

> * (Naive) Iceberg REST servers may proxy requests received for
> ‘/v1/oauth/tokens’ - and effectively become a “man-in-the-middle”, which
> is not fully compliant with the OAuth 2.0 specification.

This seems like a real concern to me. Could any REST OAuth users describe
how their implementation works, and whether it is the MITM setup that
Robert describes?

-Jack

On Thu, May 23, 2024 at 6:49 AM Dmitri Bourlatchkov
<dmitri.bourlatch...@dremio.com.invalid> wrote:

> I think Jack makes a good point with AWS SigV4 Authentication. I suppose,
> in REST Catalog implementations that support that auth method, the
> /v1/oauth/tokens Catalog REST endpoint is redundant.
>
> Cheers,
> Dmitri.
>
> On Thu, May 23, 2024 at 9:20 AM Jack Ye <yezhao...@gmail.com> wrote:
>
>> I do not know enough details about OAuth to make comments about this
>> issue, but just regarding the statement "OAuth2 is the only mechanism
>> supported by the Iceberg client", AWS SigV4 auth is also supported, at
>> least in the Java client implementation
>> <https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/HTTPClient.java#L72>.
>> It would be nice if we formalize that in the spec, at least define it as a
>> generic authorization header.
>>
>> Best,
>> Jack Ye
>>
>>
>>
>> On Thu, May 23, 2024 at 2:51 AM Robert Stupp <sn...@snazy.de> wrote:
>>
>>> Hi all,
>>>
>>> Iceberg REST implementations, whether accessible on the public internet
>>> or inside an organization, are usually secured using appropriate
>>> authorization mechanisms. The Nessie team is looking at implementing the
>>> Iceberg REST specification and has some questions about the security
>>> endpoint(s) defined in the spec.
>>>
>>> TL;DR we have questions (potentially concerns) about having the
>>> ‘/v1/oauth/tokens’ endpoint, for the reasons explained below. We think
>>> that ‘/v1/oauth/tokens’ poses potential security and OAuth2 compliance
>>> issues, and imposes how authorization should be implemented.
>>> * As an open table format, it would be good for Iceberg to focus on the
>>> table format / catalog and not how authorization is implemented. The
>>> existence of an OAuth endpoint pushes implementations to adopt
>>> authorization using only OAuth, whereas the implementers might choose
>>> several other ways to implement authorization (e.g. SAML). In our
>>> opinion the spec should leave it open to the implementation to decide
>>> how authorization will be implemented.
>>> * The existence of that endpoint also pushes operators of Iceberg REST
>>> endpoints into the authorization service business.
>>> * Clients might expose their clear-text credentials to the wrong
>>> service, if the (correct) OAuth endpoint is not configured (humans do
>>> make mistakes).
>>> * (Naive) Iceberg REST servers may proxy requests received for
>>> ‘/v1/oauth/tokens’ - and effectively become a “man-in-the-middle”, which
>>> is not fully compliant with the OAuth 2.0 specification.
>>>
>>> Our goals with this discussion are:
>>> 1. Secure the Iceberg REST specification by preventing accidental
>>> misuse/misimplementation.
>>> 2. Prevent Iceberg REST from dictating the “authorization
>>> server specifics”.
>>> 3. Enable flexibility for Iceberg REST servers to opt for other
>>> authorization mechanisms than OAuth 2.0.
>>> 4. Enable REST servers to opt for integrating with any standard OAuth2 /
>>> OIDC provider (e.g. Okta, Keycloak, Authelia).
>>>
>>> OAuth 2.0 [1] is one of the common standards accepted in the industry.
>>> It defines a secure mechanism to access resources (here: Iceberg REST
>>> endpoints). The most important aspect for OAuth 2.0 resources is that
>>> (Iceberg REST) servers do not (have to) support password authentication,
>>> especially considering the security weaknesses inherent in passwords.
>>> Compromised passwords result in compromised data protected by that
>>> password.
>>>
>>> Therefore OAuth 2.0 defines a set of strict rules. Some of these are:
>>> * Credentials (for example username/password) must _never_ be sent to
>>> the resource server, only to the authorization server.
>>> * OAuth 2.0 refresh tokens must _never_ be sent to the resource server,
>>> only to the authorization server. (“Unlike access tokens, refresh tokens
>>> are intended for use only with authorization servers and are never sent
>>> to resource servers.”, quoted from section 1.5 of RFC 6749.)
>>> * While the OAuth RFC states "The authorization server may be the same
>>> server as the resource server or a separate entity", this should not be
>>> mandated; i.e., the spec should be open to supporting implementations that
>>> have the authorization server and resource server co-located as well as
>>> separate.
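
For illustration, the rules listed above can be read as a simple destination check: secrets and refresh tokens may only go to the authorization server, while access tokens may go to the resource server. The following is a minimal Python sketch of that reading, not part of any Iceberg client. The names `may_send` and `SECRET_KINDS` and all host names are hypothetical, and comparing origins (scheme, host, port) is an assumption of this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical labels for material that must only reach the authorization server.
SECRET_KINDS = {"client_secret", "password", "refresh_token"}

DEFAULT_PORTS = {"https": 443, "http": 80}


def origin(url: str) -> tuple:
    """Return (scheme, host, port), filling in the default port if none is given."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def may_send(kind: str, destination: str, authorization_server: str) -> bool:
    """Allow secrets and refresh tokens only towards the authorization server.

    If the authorization server and the resource server are co-located,
    their origins match and the check passes, which keeps both deployment
    styles possible.
    """
    if kind in SECRET_KINDS:
        return origin(destination) == origin(authorization_server)
    return True  # e.g. access tokens, which are meant for the resource server
```

With a separate identity provider at `https://idp.example.com/token`, the check rejects a refresh token addressed to `https://catalog.example.com/v1/oauth/tokens` and accepts one addressed to the identity provider itself.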
>>>
>>> The Iceberg PR 4771 [2] added the OpenAPI path ‘/v1/oauth/tokens’,
>>> intentionally described as “To exchange client credentials (client ID and
>>> secret) for an access token. This uses the client credentials flow.”
>>> [3]. Technically: client ID and secret are submitted using an HTTP POST
>>> request to that Iceberg REST endpoint.
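
To make the request described above concrete, here is a minimal Python sketch that builds (but does not send) a client credentials token request. The host name, client ID and secret are placeholders, and `build_token_request` is a hypothetical helper. RFC 6749 also allows the client to authenticate with HTTP Basic, so placing the credentials in the form body is an assumption of this sketch.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_token_request(token_endpoint: str, client_id: str, client_secret: str) -> Request:
    """Build a form-encoded POST for the OAuth 2.0 client credentials grant."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,  # sent in clear text inside the request body
    }).encode("ascii")
    return Request(
        token_endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; whichever host is configured here receives the secret.
req = build_token_request(
    "https://catalog.example.com/v1/oauth/tokens", "my-client", "s3cr3t"
)
fields = parse_qs(req.data.decode("ascii"))
```

The point of the sketch is that the secret travels to whatever URL the client is configured with, so a catalog server answering on this path sees the credentials regardless of whether it is the actual authorization server.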
>>>
>>> Having ‘/v1/oauth/tokens’ in the Iceberg REST specification can easily
>>> be seen as a hard requirement. In order to implement this in compliance
>>> with the OAuth 2.0 spec, that ‘/v1/oauth/tokens’ MUST be the
>>> authorization server. If users do not (want to) implement an
>>> authorization server, the only way to implement this ‘/v1/oauth/tokens’
>>> endpoint would be to proxy ‘/v1/oauth/tokens’ to the actual
>>> authorization server, which means, that this proxy technically becomes a
>>> “man in the middle” - knowing both all credentials and all involved
>>> tokens.
>>>
>>> Even if an Iceberg REST server does not implement the ‘/v1/oauth/tokens’
>>> endpoint, it can still receive requests to ‘/v1/oauth/tokens’ containing
>>> clear-text credentials if clients are misconfigured (humans do make
>>> mistakes). The risk is non-zero: bad actors can implement or intercept
>>> that ‘/v1/oauth/tokens’ endpoint and simply wait for misconfigured
>>> clients to send credentials.
>>>
>>> Further usages of the REST Catalog API path ‘/v1/oauth/tokens’ are “To
>>> exchange a client token and an identity token for a more specific access
>>> token. This uses the token exchange flow.” and “To exchange an access
>>> token for one with the same claims and a refreshed expiration period.
>>> This uses the token exchange flow.” Both usages should and can be
>>> implemented differently.
>>>
>>> Apache Iceberg, as a table format project, should recommend protecting
>>> sensitive information, but it should not mandate _how_ that protection
>>> is implemented. Yet the Iceberg REST specification does effectively
>>> mandate OAuth 2.0, because other Iceberg REST endpoints refer to or
>>> require OAuth 2.0 specifics. Users who must use other mechanisms,
>>> because their organization forces them to, would be locked out of
>>> Iceberg REST.
>>>
>>> Apache Iceberg should not mandate OAuth 2.0 as the only option - for the
>>> sake of openness for the project and flexibility for the server
>>> implementations.
>>>
>>> We think that Apache Iceberg REST Catalog spec should not mandate that a
>>> catalog implementation responds to requests to produce Auth Tokens
>>> (since the REST spec v1 defines a ‘/v1/oauth/tokens’ endpoint, current
>>> implementations have to take deliberate actions when responding to those
>>> requests, whether with successful token responses or with “access
>>> denied” or “unsupported” responses).
>>>
>>> We propose the following actions:
>>> 1. Immediate mitigation:
>>> 1.1. Remove the ‘/v1/oauth/tokens’ endpoint entirely from Iceberg’s
>>> OpenAPI spec w/o replacement.
>>> 1.2. As long as OAuth2 is the only mechanism supported by the Iceberg
>>> client, make the existing client parameter “oauth2-server-uri”
>>> mandatory. The Iceberg REST catalog must fail to initialize if the
>>> “oauth2-server-uri” parameter is not defined.
>>> 1.3. Remove all fallbacks to the ‘/v1/oauth/tokens’ endpoint.
>>> 1.4. Forbid or discourage the communication of tokens from any Iceberg
>>> REST Catalog endpoint, both via the "token" property or with any of the
>>> "urn:ietf:params:oauth:token-type:*" properties.
>>> 2. As a follow up: We’d propose a couple of implementation fixes and
>>> changes and test improvements.
>>> 3. As a follow up: Define a discovery mechanism for both the Iceberg
>>> REST base URI and OAuth 2.0 endpoints/discovery, which allows users to
>>> use a single URI to securely access Iceberg REST endpoints.
>>> 4. As a follow up: Not new, but we also want to improve the Iceberg REST
>>> specification via the “new” REST proposal.
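
As an illustration of items 1.2 and 1.3 in the list above, here is a minimal Python sketch of a client that refuses to initialize without “oauth2-server-uri” and never falls back to the catalog's own ‘/v1/oauth/tokens’ path. `resolve_token_endpoint` is a hypothetical helper, not the Iceberg client's API. Requiring an absolute https URL is an extra assumption of this sketch and is not part of the proposal.

```python
from urllib.parse import urlparse


def resolve_token_endpoint(properties: dict) -> str:
    """Return the configured token endpoint or fail; there is no fallback."""
    uri = (properties.get("oauth2-server-uri") or "").strip()
    if not uri:
        # Item 1.2: fail initialization instead of guessing an endpoint.
        raise ValueError(
            "'oauth2-server-uri' must be set; refusing to fall back to "
            "the catalog's /v1/oauth/tokens endpoint"
        )
    parsed = urlparse(uri)
    # Assumption of this sketch: only absolute https URLs are accepted.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"'oauth2-server-uri' must be an absolute https URL: {uri!r}")
    return uri
```

A configuration that only sets the catalog URI would then fail early with a clear error, rather than sending credentials to the catalog host by default.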
>>>
>>> We do not think that adding recommendations to inline-documentation is
>>> enough to fully mitigate the above concerns.
>>>
>>>
>>> References:
>>>
>>> [1] RFC 6749 - The OAuth 2.0 Authorization Framework,
>>> https://datatracker.ietf.org/doc/html/rfc6749
>>> [2] Iceberg pull request 4771 - Core: Add OAuth2 to REST catalog spec -
>>> https://github.com/apache/iceberg/pull/4771
>>> [3] Iceberg pull request 4843 - Spec: Add more context about OAuth2 to
>>> the REST spec - https://github.com/apache/iceberg/pull/4843
>>>
>>> --
>>> Robert Stupp
>>> @snazy
>>>
>>>
