Seems like this thread did not get much attention?

> * (Naive) Iceberg REST servers may proxy requests received for
> ‘/v1/oauth/tokens’ - and effectively become a “man-in-the-middle”, which
> is not fully compliant with the OAuth 2.0 specification.
This seems like a concern to me. Could any REST OAuth users describe how their implementation is set up, and whether it is a MITM setup as Robert describes?

-Jack

On Thu, May 23, 2024 at 6:49 AM Dmitri Bourlatchkov <dmitri.bourlatch...@dremio.com.invalid> wrote:

> I think Jack makes a good point with AWS SigV4 Authentication. I suppose,
> in REST Catalog implementations that support that auth method, the
> /v1/oauth/tokens Catalog REST endpoint is redundant.
>
> Cheers,
> Dmitri.
>
> On Thu, May 23, 2024 at 9:20 AM Jack Ye <yezhao...@gmail.com> wrote:
>
>> I do not know enough details about OAuth to make comments about this
>> issue, but just regarding the statement "OAuth2 is the only mechanism
>> supported by the Iceberg client", AWS SigV4 auth is also supported, at
>> least in the Java client implementation
>> <https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/HTTPClient.java#L72>.
>> It would be nice if we formalize that in the spec, at least define it as a
>> generic authorization header.
>>
>> Best,
>> Jack Ye
>>
>> On Thu, May 23, 2024 at 2:51 AM Robert Stupp <sn...@snazy.de> wrote:
>>
>>> Hi all,
>>>
>>> Iceberg REST implementations, either accessible on the public internet
>>> or inside an organization, are usually secured using appropriate
>>> authorization mechanisms. The Nessie team is looking at implementing the
>>> Iceberg REST specification and has some questions around the security
>>> endpoint(s) defined in the spec.
>>>
>>> TL;DR we have questions (potentially concerns) about having the
>>> ‘/v1/oauth/tokens’ endpoint, for the reasons explained below. We think
>>> that ‘/v1/oauth/tokens’ poses potential security and OAuth2 compliance
>>> issues, and dictates how authorization should be implemented.
>>> * As an open table format, it would be good for Iceberg to focus on the
>>> table format / catalog and not how authorization is implemented. The
>>> existence of an OAuth endpoint pushes implementations to adopt
>>> authorization using only OAuth, whereas the implementers might choose
>>> several other ways to implement authorization (e.g. SAML). In our
>>> opinion the spec should leave it open to the implementation to decide
>>> how authorization will be implemented.
>>> * The existence of that endpoint also pushes operators of Iceberg REST
>>> endpoints into the authorization service business.
>>> * Clients might expose their clear-text credentials to the wrong
>>> service, if the (correct) OAuth endpoint is not configured (humans do
>>> make mistakes).
>>> * (Naive) Iceberg REST servers may proxy requests received for
>>> ‘/v1/oauth/tokens’ - and effectively become a “man-in-the-middle”, which
>>> is not fully compliant with the OAuth 2.0 specification.
>>>
>>> Our goals with this discussion are:
>>> 1. Secure the Iceberg REST specification by preventing accidental
>>> misuse/misimplementation.
>>> 2. Prevent Iceberg REST from dictating the “authorization server
>>> specifics”.
>>> 3. Enable flexibility for Iceberg REST servers to opt for other
>>> authorization mechanisms than OAuth 2.0.
>>> 4. Enable REST servers to opt for integrating with any standard OAuth2 /
>>> OIDC provider (e.g. Okta, Keycloak, Authelia).
>>>
>>> OAuth 2.0 [1] is one of the common standards accepted in the industry.
>>> It defines a secure mechanism to access resources (here: Iceberg REST
>>> endpoints). The most important aspect for OAuth 2.0 resources is that
>>> (Iceberg REST) servers do not (have to) support password authentication,
>>> especially considering the security weaknesses inherent in passwords.
>>> Compromised passwords result in compromised data protected by that
>>> password.
>>>
>>> Therefore OAuth 2.0 defines a set of strict rules. Some of these are:
>>> * Credentials (for example username/password) must _never_ be sent to
>>> the resource server, only to the authorization server.
>>> * OAuth 2.0 refresh tokens must _never_ be sent to the resource server,
>>> only to the authorization server. (“Unlike access tokens, refresh tokens
>>> are intended for use only with authorization servers and are never sent
>>> to resource servers.”, quoted from section 1.5 of the OAuth RFC 6749.)
>>> * While the OAuth RFC states "The authorization server may be the same
>>> server as the resource server or a separate entity", this should not be
>>> mandated, i.e. the spec should be open to supporting implementations that
>>> have the authorization server and resource server co-located as well as
>>> separate.
>>>
>>> The Iceberg PR 4771 [2] added the OpenAPI path ‘/v1/oauth/tokens’,
>>> whose documented purpose is “To exchange client credentials (client ID
>>> and secret) for an access token. This uses the client credentials flow.”
>>> [3]. Technically: client ID and secret are submitted using an HTTP POST
>>> request to that Iceberg REST endpoint.
>>>
>>> Having ‘/v1/oauth/tokens’ in the Iceberg REST specification can easily
>>> be seen as a hard requirement. In order to implement this in compliance
>>> with the OAuth 2.0 spec, that ‘/v1/oauth/tokens’ MUST be the
>>> authorization server. If users do not (want to) implement an
>>> authorization server, the only way to implement this ‘/v1/oauth/tokens’
>>> endpoint would be to proxy ‘/v1/oauth/tokens’ to the actual
>>> authorization server, which means that this proxy technically becomes a
>>> “man in the middle” - knowing both all credentials and all involved
>>> tokens.
>>>
>>> Even if an Iceberg REST server does not implement the ‘/v1/oauth/tokens’
>>> endpoint, it can still receive requests to ‘/v1/oauth/tokens’ containing
>>> clear text credentials, if clients are misconfigured (humans do make
>>> mistakes). This is a non-zero risk: bad actors can implement/intercept
>>> that ‘/v1/oauth/tokens’ endpoint and just wait for misconfigured
>>> clients to send credentials.
>>>
>>> Further usages of the REST Catalog API path ‘/v1/oauth/tokens’ are “To
>>> exchange a client token and an identity token for a more specific access
>>> token. This uses the token exchange flow.” and “To exchange an access
>>> token for one with the same claims and a refreshed expiration period.
>>> This uses the token exchange flow.” Both usages should and can be
>>> implemented differently.
>>>
>>> Apache Iceberg, as a table format project, should recommend protecting
>>> sensitive information. But Iceberg should not mandate _how_ that
>>> protection is implemented. The Iceberg REST specification, however, does
>>> effectively mandate OAuth 2.0, because other Iceberg REST endpoints
>>> refer to or require OAuth 2.0 specifics. Users that want to use other
>>> mechanisms, because they are forced to do so by their organization,
>>> would be locked out of Iceberg REST.
>>>
>>> Apache Iceberg should not mandate OAuth 2.0 as the only option - for the
>>> sake of openness for the project and flexibility for the server
>>> implementations.
>>>
>>> We think that the Apache Iceberg REST Catalog spec should not mandate
>>> that a catalog implementation responds to requests to produce auth tokens
>>> (since the REST spec v1 defines a /v1/oauth/tokens endpoint, current
>>> implementations have to take deliberate actions when responding to those
>>> requests, whether with successful token responses or with “access
>>> denied” or “unsupported” responses).
>>>
>>> We propose the following actions:
>>> 1. Immediate mitigation:
>>>    1.1. Remove the ‘/v1/oauth/tokens’ endpoint entirely from Iceberg’s
>>>    OpenAPI spec without replacement.
>>>    1.2. As long as OAuth2 is the only mechanism supported by the Iceberg
>>>    client, make the existing client parameter “oauth2-server-uri”
>>>    mandatory. The Iceberg REST catalog must fail to initialize if the
>>>    “oauth2-server-uri” parameter is not defined.
>>>    1.3. Remove all fallbacks to the ‘/v1/oauth/tokens’ endpoint.
>>>    1.4. Forbid or discourage the communication of tokens from any Iceberg
>>>    REST Catalog endpoint, either via the "token" property or with any of
>>>    the "urn:ietf:params:oauth:token-type:*" properties.
>>> 2. As a follow up: We’d propose a couple of implementation fixes and
>>> changes and test improvements.
>>> 3. As a follow up: Define a discovery mechanism for both the Iceberg
>>> REST base URI and OAuth 2.0 endpoints/discovery, which allows users to
>>> use a single URI to securely access Iceberg REST endpoints.
>>> 4. As a follow up: Not new, but we also want to improve the Iceberg REST
>>> specification via the “new” REST proposal.
>>>
>>> We do not think that adding recommendations to inline-documentation is
>>> enough to fully mitigate the above concerns.
>>>
>>>
>>> References:
>>>
>>> [1] RFC 6749 - The OAuth 2.0 Authorization Framework,
>>> https://datatracker.ietf.org/doc/html/rfc6749
>>> [2] Iceberg pull request 4771 - Core: Add OAuth2 to REST catalog spec -
>>> https://github.com/apache/iceberg/pull/4771
>>> [3] Iceberg pull request 4843 - Spec: Add more context about OAuth2 to
>>> the REST spec - https://github.com/apache/iceberg/pull/4843
>>>
>>> --
>>> Robert Stupp
>>> @snazy
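
For illustration only, proposals 1.2 and 1.3 in the quoted message could look roughly like this on the client side. This is a minimal Python sketch, not the Iceberg client's actual code: the function `resolve_token_endpoint`, the exception `MissingAuthServerError`, and the HTTPS requirement are assumptions made for the example. Only the property name "oauth2-server-uri" is taken from the proposal.

```python
from urllib.parse import urlparse

# Property name as quoted in the proposal (item 1.2).
OAUTH2_SERVER_URI = "oauth2-server-uri"


class MissingAuthServerError(ValueError):
    """Hypothetical error: no usable authorization server is configured."""


def resolve_token_endpoint(properties: dict) -> str:
    """Return the configured token endpoint, or fail.

    There is deliberately no fallback to a path on the catalog server,
    so client credentials are never sent to the resource server by default.
    """
    uri = (properties.get(OAUTH2_SERVER_URI) or "").strip()
    if not uri:
        raise MissingAuthServerError(
            f"'{OAUTH2_SERVER_URI}' must be set; refusing to fall back "
            "to a token endpoint on the catalog server"
        )

    parsed = urlparse(uri)
    # Assumption for this sketch: require an absolute HTTPS URL, so a
    # relative path such as 'v1/oauth/tokens' is rejected as well.
    if parsed.scheme != "https" or not parsed.netloc:
        raise MissingAuthServerError(
            f"'{OAUTH2_SERVER_URI}' must be an absolute https URL, got: {uri!r}"
        )
    return uri
```

With a check like this, a catalog configured with only its own base URI fails at initialization instead of posting a client ID and secret to whatever happens to answer on the catalog host.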