I'm concerned that some catalogs might start making every table
`CATALOG_ONLY`, which would essentially lock users into the catalog without
providing a way to migrate the data to another catalog.
Maybe we should add a sentence to the spec requiring that there be some
users for whom the catalog MUST provide access to the metadata files.

WDYT?

On Thu, Jan 8, 2026, 18:38 Amogh Jahagirdar <[email protected]> wrote:

> I did a pass over the PR, but I guess I'm a little skeptical about what the
> notion of "preferences" truly gets us in the protocol. In case the endpoint is
> available but not enforced, my mental model is to just let the client make
> whatever choice it wants. If a server really thinks it's advantageous to
> use remote planning, I'd think it'd just say server-side planning is
> enforced. For the "momentary load" case, all a client would need to do is
> handle the server throttling and fall back to client-side planning
> (I don't think the protocol needs to expand just for that).
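
The throttling fallback described above can be sketched purely on the client side. This is only an illustration under stated assumptions: `ServerThrottledError`, `plan_with_fallback`, and the two planner callables are hypothetical names, not part of the REST spec or of any Iceberg client.

```python
from typing import Callable, List


class ServerThrottledError(Exception):
    """Hypothetical: raised when the planning endpoint signals throttling."""


def plan_with_fallback(
    server_plan: Callable[[], List[str]],
    client_plan: Callable[[], List[str]],
    server_enforced: bool = False,
) -> List[str]:
    """Try server-side planning first; fall back to local planning if throttled."""
    try:
        return server_plan()
    except ServerThrottledError:
        if server_enforced:
            # Assumption: with enforced server-side planning the client
            # cannot plan locally, so the error is surfaced to the caller.
            raise
        return client_plan()
```

When server-side planning is enforced, the sketch re-raises instead of falling back, since in that case the client is assumed to lack what it needs to plan locally.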
>
> On Wed, Jan 7, 2026 at 11:28 AM Russell Spitzer <[email protected]>
> wrote:
>
>> I'm in agreement with Prashant's current plan; I have no preference on
>> naming of "Only" vs "Enforced".
>>
>> On Wed, Jan 7, 2026 at 4:42 AM Eduard Tudenhöfner <
>> [email protected]> wrote:
>>
>>> Instead of calling it "ONLY", maybe "ENFORCED" would be a better term? I
>>> think that would more naturally express the behavior without having to
>>> define what "ONLY" really means.
>>>
>>> On Wed, Dec 24, 2025 at 12:05 AM Prashant Singh <
>>> [email protected]> wrote:
>>>
>>>> *Hi everyone,*
>>>>
>>>> *JB:* Mostly yes, but it's more about what the server wants the client
>>>> to do. The server can indicate if it supports a mode or not via the
>>>> /v1/config endpoint at this point.
>>>>
>>>> *Russell:* Thank you for the thorough feedback! I think it is a great
>>>> idea to break the optional mode into *Prefer Client | Prefer Catalog*—it
>>>> really opens up a lot of interesting use cases.
>>>>
>>>> For example, the server might support planning but, due to momentary
>>>> load, wants the client to see if it's open to planning on the client side.
>>>> Similarly, an argument can be made that if the server has a table cached in
>>>> memory, it would prefer the client comes to the server. Earlier, with just
>>>> the optional value, we were simply falling back to server or client side
>>>> planning based on whether the server supported scan planning. Now, the
>>>> client can express its own overrides via catalog configs as well.
>>>>
>>>> Based on our offline discussion, I have incorporated the feedback into
>>>> the updated matrix [1] to document what the planning modes would be based
>>>> on the server response and client overrides:
>>>>
>>>>    - *CLIENT_ONLY + CATALOG_ONLY* = FAIL
>>>>    - *One "ONLY" + opposite "PREFERRED"* = ONLY wins
>>>>    - *Both "PREFERRED"* = Client config wins
>>>>    - *Client not configured* = Use server config or default
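
The matrix above can be read as a small resolution function. The sketch below is one possible reading, not the reference implementation: the enum member names `CLIENT_PREFERRED` and `CATALOG_PREFERRED`, the function name `resolve_planning_mode`, and the fallback default are all assumptions, and the case where only the client is configured is not covered by the matrix.

```python
from enum import Enum
from typing import Optional


class PlanningMode(Enum):
    # Names other than CLIENT_ONLY / CATALOG_ONLY are assumed for illustration.
    CLIENT_ONLY = "client-only"
    CLIENT_PREFERRED = "client-preferred"
    CATALOG_PREFERRED = "catalog-preferred"
    CATALOG_ONLY = "catalog-only"


_ONLY = {PlanningMode.CLIENT_ONLY, PlanningMode.CATALOG_ONLY}


def resolve_planning_mode(
    client: Optional[PlanningMode],
    server: Optional[PlanningMode],
    default: PlanningMode = PlanningMode.CLIENT_PREFERRED,  # assumed default
) -> PlanningMode:
    """Combine the client override and the server response into one mode."""
    if client is None:
        # Client not configured: use server config or default.
        return server if server is not None else default
    if server is None:
        # Not covered by the matrix; assumption: the client setting applies.
        return client
    if client in _ONLY and server in _ONLY and client is not server:
        # CLIENT_ONLY + CATALOG_ONLY = FAIL
        raise ValueError(f"conflicting planning modes: {client.name} vs {server.name}")
    if server in _ONLY:
        # A server-side "ONLY" beats a client "PREFERRED".
        return server
    # Either the client holds the "ONLY", or both are "PREFERRED":
    # in both cases the client config wins.
    return client
```

For example, a client configured with `CLIENT_PREFERRED` talking to a server that returns `CATALOG_ONLY` resolves to `CATALOG_ONLY`, while two conflicting "ONLY" settings raise an error.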
>>>>
>>>> I will update the reference implementation soon based on this. I would
>>>> love to know what other folks think!
>>>>
>>>> Best,
>>>>
>>>> Prashant Singh
>>>>
>>>> [1]
>>>> https://github.com/apache/iceberg/pull/14867#issuecomment-3683989832
>>>>
>>>> On Sat, Dec 20, 2025 at 1:26 PM Russell Spitzer <
>>>> [email protected]> wrote:
>>>>
>>>>> I can imagine one more
>>>>>
>>>>>
>>>>> (None - I would rename this) ClientOnly - Client can use Catalog
>>>>> Planning or Local Planning
>>>>>
>>>>> PreferClient - Client should use local planning, but the plan api is
>>>>> available for this table — I can only imagine this would be useful for a
>>>>> scenario where most clients are heavy and have the resources to do local
>>>>> planning (or engine distributed planning) but you still want to support
>>>>> lightweight clients which can’t really do planning themselves.
>>>>>
>>>>> PreferCatalog - Client should use the plan API, but credentials have
>>>>> been provided to enable local planning — This is probably a transitional
>>>>> state as we move from clients that only support local planning to those
>>>>> which can use the plan api.
>>>>>
>>>>> CatalogOnly - Clients are not provided with the credentials required
>>>>> to read the table from the Metadata.json alone. If they do not implement
>>>>> the scan plan API they should fail fast, otherwise they will fail when
>>>>> they attempt to load a manifest_list file. This is used in circumstances
>>>>> where the catalog is giving either file-specific credentials or is
>>>>> protecting the delivered files in some way, such that their contents have
>>>>> been specially redacted or something like that.
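
A rough sketch of how a client might act on the four modes above, including the fail-fast behavior for CatalogOnly. This is one interpretation only: the function `choose_planner`, its two capability flags, and the error type are hypothetical, and the enum values simply reuse the names proposed above.

```python
from enum import Enum


class PlanningMode(Enum):
    CLIENT_ONLY = "ClientOnly"
    PREFER_CLIENT = "PreferClient"
    PREFER_CATALOG = "PreferCatalog"
    CATALOG_ONLY = "CatalogOnly"


class UnsupportedPlanningModeError(RuntimeError):
    """Hypothetical: the client cannot honor the mode the catalog returned."""


def choose_planner(
    mode: PlanningMode,
    supports_plan_api: bool,
    can_plan_locally: bool = True,
) -> str:
    """Return "catalog" or "client", failing fast when CatalogOnly is unusable."""
    if mode is PlanningMode.CATALOG_ONLY:
        if not supports_plan_api:
            # Fail at table load rather than later, when reading a
            # manifest list without credentials would fail anyway.
            raise UnsupportedPlanningModeError(
                "table requires catalog planning, but this client "
                "does not implement the scan plan API"
            )
        return "catalog"
    if mode is PlanningMode.PREFER_CATALOG:
        return "catalog" if supports_plan_api else "client"
    if mode is PlanningMode.PREFER_CLIENT:
        # Lightweight clients that cannot plan locally may still use the API.
        if can_plan_locally or not supports_plan_api:
            return "client"
        return "catalog"
    return "client"  # CLIENT_ONLY
```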
>>>>>
>>>>>
>>>>> I assume most catalogs will start with “ClientOnly” or “None”
>>>>>
>>>>> Then as catalogs begin to support the planning API we will see most
>>>>> tables move to PreferCatalog, with some perhaps extremely heavy or large
>>>>> tables staying as PreferClient or ClientOnly.
>>>>>
>>>>> Then catalogs with special protections may have some tables return
>>>>>  CatalogOnly so they can either scope credentials more tightly or
>>>>> manipulate the files that the client actually has access to in some way.
>>>>>
>>>>> On Sat, Dec 20, 2025 at 1:09 AM Jean-Baptiste Onofré <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Prashant
>>>>>>
>>>>>> It makes sense to me. I guess we are using Catalog properties to
>>>>>> indicate what the REST server supports to the client, right?
>>>>>> I will take a look at the PR, but I like the idea.
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>> On Sat, Dec 20, 2025 at 12:53 AM Prashant Singh <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hey All,
>>>>>>>
>>>>>>> I wanted to bring up the discussion of introducing a concept of REST
>>>>>>> scan planning mode, which would help the server instruct the client on
>>>>>>> how to plan the table, via loadTableResponse or a table-level config
>>>>>>> override.
>>>>>>> There are three possible values one could think of:
>>>>>>> 1. *None*: i.e. plan it on the client side; this may be because the
>>>>>>> table is too small and the additional REST request would add more
>>>>>>> overhead than benefit.
>>>>>>> 2. *Optional*: the client can choose to plan it either locally or
>>>>>>> trigger server-side planning.
>>>>>>> 3. *Required*: the client MUST do server-side planning; the server
>>>>>>> could suggest this if it has better indexed the Iceberg metadata, the
>>>>>>> client is running on low resources, or the table is protected. The server
>>>>>>> MAY choose whatever way is required to ensure the client can't bypass
>>>>>>> this; for example, it could withhold vended credentials from loadTable
>>>>>>> and only mint them as part of planning completion, which would mean a
>>>>>>> client that doesn't call plan table never receives them.
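
The enforcement idea for *Required* (no credentials from loadTable, credentials minted only at planning completion) can be illustrated with a toy in-memory catalog. Everything here is hypothetical: `ToyCatalog`, its method names, and the response keys `planning-mode` and `credential` are made up for the sketch and do not reflect the actual REST spec.

```python
from typing import Dict, List, Optional, Set


class ToyCatalog:
    """Hypothetical in-memory stand-in for a REST catalog; not a real API."""

    def __init__(self, required_tables: Set[str], files: Dict[str, List[str]]):
        self._required = required_tables
        self._files = files

    def load_table(self, name: str) -> Dict[str, Optional[str]]:
        # For "required" tables, loadTable carries no storage credential,
        # so a client cannot read the metadata tree on its own.
        required = name in self._required
        return {
            "planning-mode": "required" if required else "optional",
            "credential": None if required else f"table-cred:{name}",
        }

    def plan_table(self, name: str) -> Dict[str, object]:
        # The credential is minted only once planning completes.
        return {
            "files": list(self._files[name]),
            "credential": f"plan-cred:{name}",
        }
```

A client that skips `plan_table` for a required table ends up with `credential` set to `None` and therefore has nothing to read the data files with.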
>>>>>>>
>>>>>>> I have proactively created a pull request [1] and would love to hear
>>>>>>> all your feedback, either here or in the PR directly!
>>>>>>>
>>>>>>> Wishing you all very happy holidays; it has been great working with
>>>>>>> you all.
>>>>>>>
>>>>>>> [1] https://github.com/apache/iceberg/pull/14867
>>>>>>>
>>>>>>> Best,
>>>>>>> Prashant Singh
>>>>>>>
>>>>>>
