Re: [PR] [SPEC | CORE] : Allow table level override for scan planning [iceberg]

via GitHub Sat, 10 Jan 2026 11:51:37 -0800


singhpk234 commented on code in PR #14867:
URL: https://github.com/apache/iceberg/pull/14867#discussion_r2678933737



##########
open-api/rest-catalog-open-api.py:
##########
@@ -1291,6 +1291,29 @@ class LoadTableResult(BaseModel):
     ## General Configurations
 
     - `token`: Authorization bearer token to use for table requests if OAuth2 
security is enabled
+    - `scan-planning-mode`: Controls scan planning behavior for table 
operations. This property can be configured by:
+      - **Server**: Returned in `LoadTableResponse.config()` to advertise 
server preference/requirement
+      - **Client**: Set in catalog properties to override server configuration
+
+      **Configuration Precedence**: Client config > Server config > Default 
(`client-preferred`)
+
+      **Valid values**:
+      - `client-only`: MUST use client-side planning. Fails if paired with 
server's `catalog-only`.
+      - `client-preferred` (default): Prefer client-side planning but flexible.
+      - `catalog-preferred`: Prefer server-side planning but flexible. Falls 
back to client if server doesn't support planning endpoints.
+      - `catalog-only`: MUST use server-side planning. Requires server 
support. Fails if paired with client's `client-only`.
+
+    ### Scan Planning Negotiation
+
+    When both client and server provide `scan-planning-mode` configuration, 
the final planning decision is negotiated based on the following rules:
+
+    **Negotiation Rules:**
+    - **Incompatible requirements**: `client-only` + `catalog-only` = **FAIL**
+    - **ONLY beats PREFERRED**: When one side has "ONLY" and the other has 
"PREFERRED", the ONLY requirement wins (inflexible beats flexible)
+    - **Both PREFERRED**: When both are PREFERRED (different types), client 
config wins
+    - **Both same**: When both have the same value, use that planning type
+    - **Only one configured**: Use the configured side (client or server)
+    - **Neither configured**: Use default (`client-preferred`)

Review Comment:
   > the catalog without providing a way to migrate the data to another catalog
   
   We would still have a way to migrate, mostly because loadTable returns 
the metadata.json pointer (which fully describes the table state), and the 
catalog ADMIN would be able to use that pointer to register the table with 
another REST or Metastore-backed catalog. In the model where storage is 
decoupled from compute, it is the administrator of the catalog who has given 
the catalog access to vend storage creds, and that access can very well be 
taken back.
   
   This feature is mostly: I want to read the table, can you help me with 
the data | delete files that correspond to the table? Nevertheless, I believe 
CATALOG_ONLY is expected to be used primarily for gov cases, and also for 
things like scanning huge tables where planning can put a lot of pressure on 
the JVM (Trino coordinator instability | Spark requiring distributed 
planning), where the catalog can do some efficient indexing (stuff like 
Redis) etc. to help these engines.
   
   All in all, IMHO exposing this option would not cause vendor lock-in or 
make migration impossible; please let me know if I am missing something.
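
For reference, the negotiation rules in the diff hunk above can be sketched 
roughly as follows. This is only an illustration of the rules as written in 
the proposed spec text, not the implementation in the PR: `ScanPlanningMode`, 
`negotiate`, and `DEFAULT_MODE` are hypothetical names, and `None` is assumed 
to mean "this side did not configure `scan-planning-mode`".

```python
from enum import Enum
from typing import Optional


class ScanPlanningMode(Enum):
    """Hypothetical enum mirroring the four values listed in the spec text."""

    CLIENT_ONLY = "client-only"
    CLIENT_PREFERRED = "client-preferred"
    CATALOG_PREFERRED = "catalog-preferred"
    CATALOG_ONLY = "catalog-only"

    @property
    def is_only(self) -> bool:
        # "ONLY" modes are hard requirements; "PREFERRED" modes are flexible.
        return self in (ScanPlanningMode.CLIENT_ONLY, ScanPlanningMode.CATALOG_ONLY)


# Default named in the spec text when neither side configures the property.
DEFAULT_MODE = ScanPlanningMode.CLIENT_PREFERRED


def negotiate(
    client: Optional[ScanPlanningMode],
    server: Optional[ScanPlanningMode],
) -> ScanPlanningMode:
    """Apply the negotiation rules from the diff (hypothetical helper)."""
    # Neither configured: use the default.
    if client is None and server is None:
        return DEFAULT_MODE
    # Only one configured: use the configured side.
    if client is None:
        return server
    if server is None:
        return client
    # Both same: use that planning type.
    if client == server:
        return client
    # Incompatible requirements: client-only + catalog-only fails.
    if client.is_only and server.is_only:
        raise ValueError(
            f"Incompatible scan planning modes: {client.value} vs {server.value}"
        )
    # ONLY beats PREFERRED: the inflexible side wins.
    if client.is_only:
        return client
    if server.is_only:
        return server
    # Both PREFERRED (different types): client config wins.
    return client
```

Under this sketch, `negotiate(ScanPlanningMode.CLIENT_PREFERRED, 
ScanPlanningMode.CATALOG_ONLY)` resolves to `CATALOG_ONLY`, while pairing 
`CLIENT_ONLY` with `CATALOG_ONLY` raises an error.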



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


