[
https://issues.apache.org/jira/browse/SEDONA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022976#comment-18022976
]
Jia Yu commented on SEDONA-744:
-------------------------------
[~groenewt] Hi Tristan, thanks for creating the issue. The Sedona + Iceberg Geo
implementation is blocked by the review of this PR:
https://github.com/apache/iceberg/pull/12667
> Sedona Geometry/Geography Spark UDTs not compatible with Iceberg v3 native
> Geometry/Geography types
> ---------------------------------------------------------------------------------------------------
>
> Key: SEDONA-744
> URL: https://issues.apache.org/jira/browse/SEDONA-744
> Project: Apache Sedona
> Issue Type: Bug
> Environment: "Spark": {
> "id": "Spark",
> "name": "Spark",
> "group": "spark",
> "properties": {
> "SPARK_HOME": {
> "name": "SPARK_HOME",
> "value": "/opt/spark/",
> "type": "string"
> },
> "spark.master": {
> "name": "spark.master",
> "value": "local[*]",
> "type": "string"
> },
> "spark.submit.deployMode": {
> "name": "spark.submit.deployMode",
> "value": "client",
> "type": "string"
> },
> "spark.app.name": {
> "name": "spark.app.name",
> "value": "Zeppelin",
> "type": "string"
> },
> "spark.driver.cores": { "value": "8" },
> "spark.driver.memory": { "value": "32g" },
> "spark.executor.cores": { "value": "8" },
> "spark.executor.memory": { "value": "24g" },
> "spark.executor.instances": { "value": "1" },
> "spark.jars": {
> "value":
> "/opt/spark/jars/iceberg-spark-runtime-3.5_2.12-1.9.2.jar,/opt/spark/jars/sedona-spark-shaded-3.5_2.12-1.8.0.jar,/opt/spark/jars/geotools-wrapper-1.8.0-33.1.jar"
> },
> "zeppelin.spark.useHiveContext": { "value": "true" },
> "spark.sql.extensions": {
> "value":
> "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.sedona.sql.SedonaSqlExtensions"
> },
> "spark.sql.catalog.iceberg.uri": {
> "value": "thrift://hive-metastore:9083"
> },
> "spark.sql.catalog.iceberg.warehouse": {
> "value": "ofs://omservice/warehouse/iceberg"
> },
> ### 🔒 Trimmed secrets ###
> "spark.hadoop.javax.jdo.option.ConnectionUserName": {
> "value": "***REDACTED***"
> },
> "spark.hadoop.javax.jdo.option.ConnectionURL": {
> "value": "***REDACTED***"
> },
> "spark.hadoop.javax.jdo.option.ConnectionPassword": {
> "value": "***REDACTED***"
> },
> "spark.hive.metastore.uris": {
> "value": "thrift://hive-metastore:9083"
> },
> "spark.sql.hive.metastore.version": { "value": "3.1.3" },
> "spark.sql.hive.metastore.jars": { "value": "maven" },
> "spark.sql.hive.metastore.sharedPrefixes": {
> "value":
> "org.postgresql,org.apache.hadoop.hive.ql.io.parquet,org.apache.hadoop.hive.serde2"
> },
> "spark.sql.hive.convertMetastoreParquet": { "value": "false" },
> "spark.sql.hive.convertMetastoreOrc": { "value": "false" },
> "spark.sql.hive.caseSensitiveInferenceMode": { "value": "NEVER_INFER" },
> "spark.hadoop.fs.ofs.impl": {
> "value": "org.apache.hadoop.fs.ozone.RootedOzoneFileSystem"
> },
> "spark.hadoop.fs.AbstractFileSystem.ofs.impl": {
> "value": "org.apache.hadoop.fs.ozone.OzoneFileSystem"
> },
> "spark.hadoop.fs.defaultFS": { "value": "ofs://omservice/" },
> "spark.sql.warehouse.dir": { "value": "ofs://omservice/warehouse/spark" },
> "spark.executor.heartbeatInterval": { "value": "120s" },
> "spark.shuffle.service.enabled": { "value": "false" },
> "spark.dynamicAllocation.enabled": { "value": "false" },
> "spark.driver.bindAddress": { "value": "0.0.0.0" },
> "spark.network.timeout": { "value": "1200s" },
> "spark.sql.shuffle.partitions": { "value": "50" },
> "spark.sql.adaptive.enabled": { "value": "true" },
> "spark.sql.adaptive.coalescePartitions.enabled": { "value": "true" },
> "spark.storage.blockManagerSlaveTimeoutMs": { "value": "1200000" },
> "spark.shuffle.registration.timeout": { "value": "120000" },
> "spark.shuffle.registration.maxAttempts": { "value": "8" },
> "spark.sql.iceberg.vectorization.enabled": { "value": "true" }
> }
> }
> Reporter: Tristan Groenewold
> Priority: Blocker
> Labels: Type-Defect, Type-Enhancement, iceberg
> Fix For: 1.8.0
>
>
> When using Sedona with Spark 3.5 and Iceberg v3 tables, attempts to create an
> Iceberg table with {{geometry}} or {{geography}} fail. Iceberg v3 defines
> these as native types, but Sedona registers them as Spark User-Defined Types
> (UDTs). Spark’s SQL layer rejects UDTs in DDL for Iceberg tables with the
> error:
> ```python
> pyspark.errors.exceptions.captured.UnsupportedOperationException:
> User-defined types are not supported
> ```
> *Reproduction Code*
> ```python
> %Spark.pyspark
> from sedona.spark import *
> from sedona.register import SedonaRegistrator
> from pyspark.sql.functions import expr
> sedona = SedonaContext.create(spark)
> SedonaRegistrator.registerAll(spark)
> # Create Iceberg v3 table with geometry column
> spark.sql("""
> CREATE TABLE iceberg.geo.icetable2 (id string, geometry geometry)
> USING iceberg
> TBLPROPERTIES('format-version'='3')
> """)
> ```
> *Observed Behavior*
> Fails with {{User-defined types are not supported}}.
> *Expected Behavior*
> Sedona geometries should be writable/readable as Iceberg v3 native
> {{geometry}} (and eventually {{geography}}) columns.
> *Possible Approaches*
> * Align Sedona’s UDT registration with Spark’s logical type system so that
> Iceberg recognizes {{geometry}}/{{geography}} as native types.
> * Provide a type mapping bridge layer: Sedona UDT ↔ Iceberg v3 type.
> * Add explicit serializers/deserializers for Iceberg’s geometry type.
> * Prior community approach (compatible with Spark 3.1, 3.2, and 3.3 only):
> https://github.com/spatialx-project/sedona-iceberg-extension/
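As a rough illustration of the "explicit serializers/deserializers" approach listed in the issue, here is a minimal sketch of a round trip between a point and its WKB bytes. It assumes the Iceberg v3 geometry value is carried as WKB and handles 2D points only. The helper names `point_to_wkb` and `wkb_to_point` are hypothetical and not part of Sedona or Iceberg.

```python
import struct
from typing import Tuple

# Assumption: the native geometry value is exchanged as OGC WKB.
# Only 2D points are covered; real geometries need a full WKB codec.
WKB_POINT = 1  # OGC geometry type code for Point


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # 1 byte order flag (1 = little endian), uint32 type code, two doubles.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def wkb_to_point(data: bytes) -> Tuple[float, float]:
    """Decode a 2D WKB point, honoring the byte order flag (hypothetical helper)."""
    if len(data) != 21:
        raise ValueError("expected 21 bytes for a 2D WKB point")
    endian = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError("not a WKB point")
    return x, y


if __name__ == "__main__":
    encoded = point_to_wkb(1.5, -2.25)
    print(len(encoded), wkb_to_point(encoded))
```

In Sedona itself, the existing `ST_AsBinary` and `ST_GeomFromWKB` functions cover the same conversion for arbitrary geometries, so a bridge layer would more likely build on those than on hand-written packing.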
--
This message was sent by Atlassian Jira
(v8.20.10#820010)