I'm +1 on this, though I wanted to raise an alternative: having the server send back presigned URLs for the file locations. To be clear, I don't think the two approaches are mutually exclusive, and as I mentioned I'm +1 on leveraging catalog-vended storage credentials as done in this PR; I just wanted to think through the tradeoffs.
I think the clearest benefit of the proposed approach is that many catalogs already have mechanisms to vend credentials to clients, so this change, along with the other change for refreshing credentials for a given plan, is likely not a heavy lift for *servers* to implement. The complexity in this approach will largely fall on the client implementation, where we're going to have to work through some FileIO scoping challenges for a given plan. In the end it's all doable, but it does shift some complexity to the client (handling the refreshing, the scoping, and any caching on top of that).

Presigned URLs are supported by all the major object storage providers, as far as I checked. Clients would have to change in order to distinguish between expected object storage URI structures and presigned URLs, but I think the overall client-side complexity for scoping is reduced compared to the credential vending approach. Here the complexity shifts to the server, which needs to sign the objects. With a large number of files, that likely means a lot of additional load on the server (signing is CPU bound). Also, if later on there's a desire to extend the protocol to say "Hey, read everything in this directory", then a scoped credential for that is desirable (required?).

My TL;DR analysis is that credential vending in scan planning is probably net better for larger-scale scans and is a lighter lift for server implementations today, while presigned URLs are probably better for making it easy for a wide variety of clients to integrate. In the end, I don't think the two approaches are incompatible with each other and I don't see any one-way doors, so I think it's entirely reasonable to start with the proposed approach. Wonder what others think!
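To make the client-side change for presigned URLs a bit more concrete, here is a rough sketch of how a client might tell a presigned URL apart from a regular object storage URI. This is only a heuristic and not part of the proposal: `is_presigned_url` and `SIGNATURE_PARAMS` are hypothetical names, and the check assumes the server hands back HTTP(S) URLs carrying one of the well-known signature query parameters (`X-Amz-Signature` for S3, `X-Goog-Signature` for GCS, `sig` for Azure SAS).

```python
from urllib.parse import parse_qs, urlparse

# Assumption: these query parameters mark a signed URL for the major providers.
SIGNATURE_PARAMS = {"x-amz-signature", "x-goog-signature", "sig"}


def is_presigned_url(location: str) -> bool:
    """Heuristically decide whether a file location is a presigned URL.

    Returns False for storage URIs such as s3://bucket/key, which would
    still go through the regular FileIO and credential path.
    """
    parsed = urlparse(location)
    if parsed.scheme not in ("http", "https"):
        return False
    # Compare parameter names case-insensitively.
    query_keys = {
        key.lower() for key in parse_qs(parsed.query, keep_blank_values=True)
    }
    return bool(query_keys & SIGNATURE_PARAMS)
```

A client could branch on this per file: plain HTTP GET for presigned locations, and the existing FileIO for everything else. A real protocol would more likely mark presigned locations explicitly in the response rather than have clients sniff query parameters.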
Thanks,
Amogh Jahagirdar

On Wed, Nov 12, 2025 at 7:49 AM Eduard Tudenhöfner <[email protected]> wrote:

> Hey everyone,
>
> For server-side scan planning we missed adding storage credentials, hence
> I'm proposing to add them to the response of the */plan* endpoint.
>
> The OpenAPI changes can be seen in PR #14563
> <https://github.com/apache/iceberg/pull/14563>.
>
> Looking forward to your thoughts and feedback.
>
> Thanks,
> Eduard
>
