[ 
https://issues.apache.org/jira/browse/SEDONA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022976#comment-18022976
 ] 

Jia Yu commented on SEDONA-744:
-------------------------------

[~groenewt] Hi Tristan, thanks for creating the issue. The Sedona + Iceberg Geo 
implementation is blocked by the review of this PR: 
https://github.com/apache/iceberg/pull/12667

> Sedona Geometry/Geography Spark UDTs not compatible with Iceberg v3 native 
> Geometry/Geography types
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SEDONA-744
>                 URL: https://issues.apache.org/jira/browse/SEDONA-744
>             Project: Apache Sedona
>          Issue Type: Bug
>         Environment: "Spark": {
>   "id": "Spark",
>   "name": "Spark",
>   "group": "spark",
>   "properties": {
>     "SPARK_HOME": {
>       "name": "SPARK_HOME",
>       "value": "/opt/spark/",
>       "type": "string"
>     },
>     "spark.master": {
>       "name": "spark.master",
>       "value": "local[*]",
>       "type": "string"
>     },
>     "spark.submit.deployMode": {
>       "name": "spark.submit.deployMode",
>       "value": "client",
>       "type": "string"
>     },
>     "spark.app.name": {
>       "name": "spark.app.name",
>       "value": "Zeppelin",
>       "type": "string"
>     },
>     "spark.driver.cores": { "value": "8" },
>     "spark.driver.memory": { "value": "32g" },
>     "spark.executor.cores": { "value": "8" },
>     "spark.executor.memory": { "value": "24g" },
>     "spark.executor.instances": { "value": "1" },
>     "spark.jars": {
>       "value": "/opt/spark/jars/iceberg-spark-runtime-3.5_2.12-1.9.2.jar,/opt/spark/jars/sedona-spark-shaded-3.5_2.12-1.8.0.jar,/opt/spark/jars/geotools-wrapper-1.8.0-33.1.jar"
>     },
>     "zeppelin.spark.useHiveContext": { "value": "true" },
>     "spark.sql.extensions": {
>       "value": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.sedona.sql.SedonaSqlExtensions"
>     },
>     "spark.sql.catalog.iceberg.uri": {
>       "value": "thrift://hive-metastore:9083"
>     },
>     "spark.sql.catalog.iceberg.warehouse": {
>       "value": "ofs://omservice/warehouse/iceberg"
>     },
>     ### 🔒 Trimmed secrets ###
>     "spark.hadoop.javax.jdo.option.ConnectionUserName": {
>       "value": "***REDACTED***"
>     },
>     "spark.hadoop.javax.jdo.option.ConnectionURL": {
>       "value": "***REDACTED***"
>     },
>     "spark.hadoop.javax.jdo.option.ConnectionPassword": {
>       "value": "***REDACTED***"
>     },
>     "spark.hive.metastore.uris": {
>       "value": "thrift://hive-metastore:9083"
>     },
>     "spark.sql.hive.metastore.version": { "value": "3.1.3" },
>     "spark.sql.hive.metastore.jars": { "value": "maven" },
>     "spark.sql.hive.metastore.sharedPrefixes": {
>       "value": "org.postgresql,org.apache.hadoop.hive.ql.io.parquet,org.apache.hadoop.hive.serde2"
>     },
>     "spark.sql.hive.convertMetastoreParquet": { "value": "false" },
>     "spark.sql.hive.convertMetastoreOrc": { "value": "false" },
>     "spark.sql.hive.caseSensitiveInferenceMode": { "value": "NEVER_INFER" },
>     "spark.hadoop.fs.ofs.impl": {
>       "value": "org.apache.hadoop.fs.ozone.RootedOzoneFileSystem"
>     },
>     "spark.hadoop.fs.AbstractFileSystem.ofs.impl": {
>       "value": "org.apache.hadoop.fs.ozone.OzoneFileSystem"
>     },
>     "spark.hadoop.fs.defaultFS": { "value": "ofs://omservice/" },
>     "spark.sql.warehouse.dir": { "value": "ofs://omservice/warehouse/spark" },
>     "spark.executor.heartbeatInterval": { "value": "120s" },
>     "spark.shuffle.service.enabled": { "value": "false" },
>     "spark.dynamicAllocation.enabled": { "value": "false" },
>     "spark.driver.bindAddress": { "value": "0.0.0.0" },
>     "spark.network.timeout": { "value": "1200s" },
>     "spark.sql.shuffle.partitions": { "value": "50" },
>     "spark.sql.adaptive.enabled": { "value": "true" },
>     "spark.sql.adaptive.coalescePartitions.enabled": { "value": "true" },
>     "spark.storage.blockManagerSlaveTimeoutMs": { "value": "1200000" },
>     "spark.shuffle.registration.timeout": { "value": "120000" },
>     "spark.shuffle.registration.maxAttempts": { "value": "8" },
>     "spark.sql.iceberg.vectorization.enabled": { "value": "true" }
>   }
> }
>            Reporter: Tristan Groenewold
>            Priority: Blocker
>              Labels: Type-Defect, Type-Enhancement, iceberg
>             Fix For: 1.8.0
>
>
> When using Sedona with Spark 3.5 and Iceberg v3 tables, attempts to create an 
> Iceberg table with a {{geometry}} or {{geography}} column fail. Iceberg v3 
> defines these as native types, but Sedona registers them as Spark User-Defined 
> Types (UDTs). Spark’s SQL layer rejects UDTs in DDL for Iceberg tables with 
> the error:
> ```python
> pyspark.errors.exceptions.captured.UnsupportedOperationException: User-defined types are not supported
> ```
> *Reproduction Code*
> ```python
> %Spark.pyspark
> from sedona.spark import *
> from sedona.register import SedonaRegistrator
> from pyspark.sql.functions import expr
> sedona = SedonaContext.create(spark)
> SedonaRegistrator.registerAll(spark)
> # Create Iceberg v3 table with geometry column
> spark.sql("""
> CREATE TABLE iceberg.geo.icetable2 (id string, geometry geometry)
> USING iceberg
> TBLPROPERTIES('format-version'='3')
> """)
> ```
> *Observed Behavior*
> Fails with {{User-defined types are not supported}}.
> *Expected Behavior*
> Sedona geometries should be writable/readable as Iceberg v3 native 
> {{geometry}} (and eventually {{geography}}) columns.
> *Possible Approaches*
>  * Align Sedona’s UDT registration with Spark’s logical type system so that 
> Iceberg recognizes {{geometry}}/{{geography}} as native types.
>  * Provide a type mapping bridge layer: Sedona UDT ↔ Iceberg v3 type.
>  * Add explicit serializers/deserializers for Iceberg’s geometry type.
>  * Prior community approach (compatible with Spark 3.1, 3.2, and 3.3 only): 
> https://github.com/spatialx-project/sedona-iceberg-extension/
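
To make the "explicit serializers/deserializers" idea in the list above concrete, here is a minimal sketch in plain Python. It assumes the on-disk representation is WKB (Well-Known Binary), which is what the Iceberg v3 spec describes for geometry values, and it handles only 2D points. The helper names `point_to_wkb` and `wkb_to_point` are hypothetical and exist only for this illustration; they are not part of Sedona or Iceberg. In a real Sedona job, the existing `ST_AsBinary` and `ST_GeomFromWKB` functions would play this role.

```python
import struct
from typing import Tuple

WKB_POINT = 1  # WKB geometry type code for a 2D Point


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # Layout: 1-byte order flag (1 = little-endian), uint32 type code,
    # then two float64 coordinates. Total length is 21 bytes.
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def wkb_to_point(data: bytes) -> Tuple[float, float]:
    """Decode a 2D WKB point in either byte order (hypothetical helper)."""
    if len(data) < 21:
        raise ValueError("WKB point must be at least 21 bytes")
    # The first byte says how the remaining fields are encoded.
    order = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(order + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError("unsupported WKB geometry type: %d" % geom_type)
    return (x, y)


if __name__ == "__main__":
    encoded = point_to_wkb(1.5, -2.0)
    print(encoded.hex())
    print(wkb_to_point(encoded))
```

A bridge layer along these lines would sit between the Sedona UDT and the Iceberg column: the UDT value is serialized to WKB on write and parsed back on read, so neither side needs to know the other's in-memory representation.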



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
