Hi Daniel, hi all,

Sorry for the late reply. Here are some answers to your questions:

> I was under the impression that the AuthManager implementation was relatively
> small (based on the recent work for the GCP AuthManager)

These are not comparable. The GCP AuthManager is small because it only works for GCP, and thus can leverage Google auth libraries (more specifically, it uses the google-auth-library-oauth2-http artifact; and since this artifact is already a required dependency for iceberg-gcp, it doesn't bring in any extra dependency). Conversely, this AuthManager is a general-purpose AuthManager that can work with any IDP.

> The broader community wasn't involved in decisions made about the
> implementation

That’s exactly the purpose of this donation.

> "impersonation flow" which I'm not familiar with

This is a feature where the manager can dynamically fetch the subject token for a token exchange, thus managing both the catalog's token and the user's token, facilitating impersonation (and delegation) use cases. Hence the name (admittedly a bit confusing). This feature is still evolving, but we received positive feedback from users and we believe it brings a lot of value – and is not something that a third-party library could do.

> we need to break it into smaller contributions and figure out the appropriate
> way to review and assimilate the functionality

While we are open to this option, we are concerned about the potential duration of its completion. In the interim, users have expressed a need for improved OAuth2 support. Would it be possible to gain some clarity regarding the timeline for a review of this initiative? Perhaps an initial review of the current codebase could help identify and address any potential roadblocks? I can also schedule a demo of the new auth manager, if that helps.

> how well the community understands the behaviors.

While OAuth2 may not be familiar or palatable to most Iceberg contributors, I am confident that some of them possess the expertise to effectively review and assess the donation.
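For readers unfamiliar with token exchange, here is a minimal, language-agnostic sketch of the request body defined by RFC 8693, which the "impersonation flow" described above builds on. It is not taken from the Auth Manager codebase: the function names, the placeholder tokens, and the choice of which token plays the subject or actor role are assumptions made purely for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Standard identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def fetch_subject_token():
    """Hypothetical stand-in for dynamically obtaining the subject token.

    A real manager would run another grant against the IDP here;
    this sketch just returns a placeholder string.
    """
    return "subject-token-placeholder"


def build_token_exchange_body(subject_token, actor_token=None, scope=None):
    """Build the form-encoded body of an RFC 8693 token exchange request."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
    }
    # The actor token is optional; RFC 8693 uses it for delegation.
    if actor_token is not None:
        params["actor_token"] = actor_token
        params["actor_token_type"] = ACCESS_TOKEN_TYPE
    if scope is not None:
        params["scope"] = scope
    return urlencode(params)


# Assumed example values; real tokens would come from the IDP.
body = build_token_exchange_body(
    fetch_subject_token(),
    actor_token="actor-token-placeholder",
    scope="catalog",
)
print(body)
```

The resulting string would be sent as an `application/x-www-form-urlencoded` POST body to the IDP's token endpoint. The point of the sketch is only that two tokens take part in a single exchange, which is why the manager has to obtain and refresh both.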
> The main competency of this project isn't to implement security protocols

This may be true for the GCP auth manager or for the SigV4 one – these are vendor-specific and can leverage the respective vendor's SDK. But how would we support OAuth2 in a generic way otherwise? Or Kerberos?

Whether this is a competency of the project or not is debatable. Managing HTTP requests is not a main competency of this project either, and yet we have one RESTClient interface and one HTTPClient implementation, and lots of JSON parsers. The RESTClient in its current form already implies using some authentication protocol. The simple case of using static (provided via configuration) tokens does not cover real-world cases that users have expressed interest in. Accepting the Auth Manager will certainly require some extra attention to security protocols from Iceberg maintainers, but it will allow the project to support more advanced use cases. Additionally, the Auth Manager provides a path for users of the existing, deprecated “/token” endpoint to migrate to standard RFC-based OAuth flows.

> Was there any exploration of leveraging other standard implementations like
> Apache Oltu, Nimbus, etc. to build the implementation off of?

Yes, we considered that and decided not to go down that route, for a few reasons:

1. Most OAuth libraries provide building blocks to create clients, but they are not fully-fledged clients; you still need to write code in order to glue things together [1].
2. These libraries usually have (too?) many dependencies [2]; some of them have not been maintained for a while. And Apache Oltu is retired. In contrast, our Auth Manager only has one small dependency: auth0-jwt.
3. If you delegate to a third-party library, then you cannot share the catalog's RESTClient or Executor. The library is going to maintain its own HTTP client and executor, leading to increased resource consumption.
4. Nothing precludes us from switching to a third-party library later on (it's an implementation detail). We thought it best to start with a self-contained project.

Thanks,
Alex

[1]: https://connect2id.com/products/nimbus-oauth-openid-connect-sdk/guides/oauth-client-server-development
[2] For Nimbus: https://central.sonatype.com/artifact/com.nimbusds/oauth2-oidc-sdk/11.26/dependencies

On Thu, Jun 19, 2025 at 5:58 PM Daniel Weeks <dwe...@apache.org> wrote:
>
> I hadn't seen this thread before we discussed it yesterday, but since then
> I've taken a look and have some reservations.
>
> I was under the impression that the AuthManager implementation was relatively
> small (based on the recent work for the GCP AuthManager), but after taking a
> look at the repo, this is far from a small contribution.
>
> I strongly support more robust security support (especially for OAuth2/OIDC),
> but I don't feel this is going to be a small effort to introduce. The
> broader community wasn't involved in decisions made about the implementation
> and I see elements that give me pause (like "impersonation flow" which I'm
> not familiar with and implementation details like extensions to immutables
> that aren't consistent with the broader codebase).
>
> If we decide that we want to take this on, I feel like we need to break it
> into smaller contributions and figure out the appropriate way to review and
> assimilate the functionality in a way that's consistent with the rest of the
> project. Due to this being security related, we should take extra
> precautions around what this introduces and how well the community
> understands the behaviors.
>
> However, looking at the complexity here relative to the approach with the
> GCP, I have to question whether this is the right path overall.
> The main competency of this project isn't to implement security protocols,
> so it's a lot to say we want a full and complete (possibly with extensions)
> native implementation of the OAuth2 specification (there are whole projects
> built around that alone).
>
> Was there any exploration of leveraging other standard implementations like
> Apache Oltu, Nimbus, etc. to build the implementation off of?
>
> -Dan
>
> On Thu, Jun 19, 2025 at 5:33 AM Alex Dutra <alex.du...@dremio.com.invalid>
> wrote:
>>
>> Hi Ryan & JB, hi all,
>>
>> I think it would be easier to introduce this new manager as an
>> alternative manager. This would make the migration smoother as it
>> would give users time to migrate at their convenience. Besides, the
>> new manager has the notion of "dialects", and can be configured to
>> behave exactly like the current one (honoring the same config
>> options), making the migration even easier.
>>
>> > Why not contribute the functionality directly to the AuthManager already
>> > in Iceberg? Is this incompatible or is there a reason the current one
>> > can't be extended through contributions?
>>
>> There are a few reasons why I believe it's not possible to extend the
>> current manager indefinitely:
>>
>> 1. The current auth manager lives in iceberg-core; as we introduce
>> more features, it will become impractical to keep it there, especially
>> since some of the features will require third-party dependencies. As a
>> data point: the new manager contains almost 100 Java production
>> classes (not counting test classes and build scripts).
>> 2. The current auth manager has some well known shortcomings, notably
>> around token refreshes. It's not possible to fix that without
>> introducing regressions and potentially breaking many catalog clients
>> already in production.
>> 3. As we introduce features like Authorization Code grant support,
>> interactions with the IDP will become more complex than just a
>> request-response cycle. Since most of the current logic resides in the
>> OAuth2Util class, which is entirely public, it won't be an easy task
>> to introduce support for such complex flows while avoiding binary
>> incompatibilities.
>>
>> Thanks,
>> Alex
>>
>>
>> On Wed, Jun 18, 2025 at 11:35 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>> >
>> > Hi
>> >
>> > I think it makes sense to directly add in AuthManager. I don't see
>> > blockers (with some adaptations). Alex?
>> >
>> > From a donation process standpoint (if accepted), I'm happy to help
>> > with the SGA and IP Clearance.
>> >
>> > Regards
>> > JB
>> >
>> > On Wed, Jun 18, 2025 at 9:15 PM Ryan Blue <rdb...@gmail.com> wrote:
>> > >
>> > > I think it would be great to bring this functionality into Iceberg. I'm
>> > > curious about your plan for getting it in. It sounds like you're
>> > > suggesting adding the Dremio project to the Iceberg repo and making it
>> > > optional. Why not contribute the functionality directly to the
>> > > AuthManager already in Iceberg? Is this incompatible or is there a
>> > > reason the current one can't be extended through contributions?
>> > >
>> > > On Tue, Jun 17, 2025 at 11:23 AM Christian Thiel
>> > > <christian.t.b...@gmail.com> wrote:
>> > >>
>> > >> Hey Alex,
>> > >>
>> > >> Thanks for the Initiative - I really appreciate the effort here!
>> > >>
>> > >> Having good auth compatibility in the Catalog ecosystem is key to
>> > >> establish secure standards by making them easy to use. While Iceberg
>> > >> should stay open to other means of Authentication, OAuth2 is the most
>> > >> widely adopted interoperable auth standard, and its role in Iceberg
>> > >> REST reflects that. But with human-centric flows like Auth Code (with
>> > >> PKCE 😉) and Device Code missing from most standard clients, users often
>> > >> default to handing out personal Client ID/secret pairs - which is really
>> > >> bad from a security perspective.
>> > >>
>> > >> While I can’t speak to the Java details, I fully support bringing the
>> > >> functionality into Iceberg. I have tested the proposed code
>> > >> successfully with Spark and different IdPs, including Auth & Device
>> > >> Code flows with token refresh, as well as token refresh for Client
>> > >> Credential flows.
>> > >>
>> > >> Thanks!
>> > >>
>> > >> Christian
>> > >>
>> > >> On Mon, 16 Jun 2025 at 20:33, Alex Dutra
>> > >> <alex.du...@dremio.com.invalid> wrote:
>> > >>>
>> > >>> Hi all,
>> > >>>
>> > >>> Dremio recently open-sourced a new implementation of the Auth Manager
>> > >>> API for OAuth2:
>> > >>>
>> > >>> https://github.com/dremio/iceberg-auth-manager
>> > >>>
>> > >>> I wrote a blog post about it a while ago [1].
>> > >>>
>> > >>> Built on top of the Auth Manager API introduced in Iceberg 1.9.0, this
>> > >>> project provides a more flexible and extensible OAuth2 manager
>> > >>> compared to the built-in equivalent in Iceberg Core. It follows OAuth2
>> > >>> standards strictly, but also provides compatibility with any existing
>> > >>> Apache Iceberg REST catalog, and contains no Dremio-specific
>> > >>> functionality. To date, this is the only OAuth2 manager fully
>> > >>> compliant with external identity providers.
>> > >>>
>> > >>> Dremio would like to contribute this code to the Apache Iceberg
>> > >>> project. I am therefore initiating this discussion to determine the
>> > >>> community's interest in accepting this donation.
>> > >>>
>> > >>> This project is beneficial to the community because it addresses
>> > >>> well-known limitations, such as token refresh problems [2][3][4], and
>> > >>> also because it introduces highly anticipated features like the
>> > >>> Authorization Code grant support [5]. Fixing these limitations or
>> > >>> adding support for such large features in the built-in manager, while
>> > >>> avoiding any risk of regressions, would have been a lot harder.
>> > >>>
>> > >>> Also worth mentioning: this project adheres to the "Iceberg OAuth2
>> > >>> Client Authentication Guide", proposed by Christian Thiel [6].
>> > >>>
>> > >>> This project could initially serve as a runtime-selectable alternative
>> > >>> to the current built-in implementation. Upon reaching sufficient
>> > >>> maturity however, it could potentially replace the existing manager.
>> > >>>
>> > >>> Please share your thoughts by replying to this email. Alternatively,
>> > >>> we can discuss this topic at the Catalog Sync meeting this Wednesday,
>> > >>> June 18th, if that is a more comfortable option to everyone.
>> > >>>
>> > >>> Thanks,
>> > >>>
>> > >>> Alex
>> > >>>
>> > >>> [1]: https://medium.com/data-engineering-with-dremio/introducing-dremio-auth-manager-for-apache-iceberg-223827342d19
>> > >>> [2]: https://github.com/apache/iceberg/issues/12196
>> > >>> [3]: https://github.com/apache/iceberg/issues/12363
>> > >>> [4]: https://github.com/apache/iceberg/issues/13030
>> > >>> [5]: https://github.com/apache/iceberg/issues/10677
>> > >>> [6]: https://docs.google.com/document/d/1buW9PCNoHPeP7Br5_vZRTU-_3TExwLx6bs075gi94xc/edit?tab=t.0#heading=h.hufqidg1ij89