pan3793 commented on code in PR #15736:
URL: https://github.com/apache/iceberg/pull/15736#discussion_r2978827887
##########
docs/docs/spark-queries.md:
##########
@@ -51,6 +51,14 @@ writing filters that match Iceberg partition transforms. These functions are ava
 [Iceberg catalog](spark-configuration.md#catalog-configuration); they are not registered in Spark's built-in catalog.
 
+!!! note
+    In Spark versions before 4.2.0, `SparkSessionCatalog` does not expose Iceberg's `system`
+    namespace (see SPARK-54760). Queries such as `SELECT spark_catalog.system.bucket(16, id)`

Review Comment:
   @kevinjqliu, to clarify, I initially created SPARK-54760 as a bug, but during discussion, it was eventually classified as a missing feature. I updated the Jira ticket to reflect that.

   I think we can use simple words to explain that, e.g.,

   > Spark before 4.2.0 does not support V2Function in the session catalog, see [SPARK-54760](https://issues.apache.org/jira/browse/SPARK-54760) for details.



##########
docs/docs/spark-configuration.md:
##########
@@ -112,6 +112,13 @@ Spark's built-in catalog supports existing v1 and v2 tables tracked in a Hive Me
 This configuration can use same Hive Metastore for both Iceberg and non-Iceberg tables.
 
+`SparkSessionCatalog` is useful when you want `spark_catalog` to work with both Iceberg and non-Iceberg
+tables in the same metastore. It is not a full replacement for a dedicated Iceberg catalog, though.

Review Comment:
   I would not say these words



##########
docs/docs/spark-configuration.md:
##########
@@ -112,6 +112,13 @@ Spark's built-in catalog supports existing v1 and v2 tables tracked in a Hive Me
 This configuration can use same Hive Metastore for both Iceberg and non-Iceberg tables.
 
+`SparkSessionCatalog` is useful when you want `spark_catalog` to work with both Iceberg and non-Iceberg
+tables in the same metastore. It is not a full replacement for a dedicated Iceberg catalog, though.
+In Spark versions before 4.2.0, `SparkSessionCatalog` does not expose Iceberg's `system` namespace
+(see SPARK-54760), so catalog-scoped SQL functions such as `system.bucket`, `system.days`, and
+`system.iceberg_version` are not available through `spark_catalog`. To use those functions, configure a
+separate Iceberg catalog with `org.apache.iceberg.spark.SparkCatalog` and call them through that catalog.

Review Comment:
   Will this introduce any side effects when users configure two catalogs pointing to the same catalog (e.g., two Hive catalogs backed by the same HMS)?

   To be conservative, maybe explicitly say "workaround":

   "To use those functions" > "To work around this limitation"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
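
A minimal sketch of the workaround discussed above: registering a dedicated Iceberg catalog alongside `spark_catalog` so that the `system` functions resolve through it. The catalog name `iceberg_cat`, the `hive` catalog type, and the table `db.tbl` are illustrative assumptions, not taken from the PR.

```sql
-- Hypothetical spark-defaults.conf entries (catalog name `iceberg_cat` is illustrative):
--   spark.sql.catalog.iceberg_cat      = org.apache.iceberg.spark.SparkCatalog
--   spark.sql.catalog.iceberg_cat.type = hive

-- With that catalog registered, the partition-transform functions resolve
-- through it, even when `spark_catalog` (a SparkSessionCatalog) cannot expose them:
SELECT iceberg_cat.system.bucket(16, id) FROM iceberg_cat.db.tbl;
```

Since both catalogs can be backed by the same Hive Metastore, this only changes how the functions are addressed, which is the side-effect question the last comment raises.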
