Huaxin Gao created SPARK-51479:
----------------------------------

             Summary: Nullable in Row Level Operation Column is not correct
                 Key: SPARK-51479
                 URL: https://issues.apache.org/jira/browse/SPARK-51479
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Huaxin Gao


I got a few test failures in the Iceberg / Spark 4.0 integration. I think the 
root cause is that the nullability of the Row Level Operation columns is not 
correct: in the failure below, _spec_id is expected to be required but is 
provided as optional.

Here is the test failure:
{quote}TestMergeOnReadUpdate > testUpdateWithMultiColumnInSubquery() > catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config = {type=hive, default-namespace=default, clients=1, parquet-enabled=false, cache-enabled=false}, format = AVRO, vectorized = false, distributionMode = range, fanout = false, branch = test, planningMode = DISTRIBUTED, formatVersion = 3 FAILED
    java.lang.IllegalArgumentException: Provided metadata schema is incompatible with expected schema:
    table {
      2147483643: _spec_id: required int (Spec ID used to track the file containing a row)
      2147483642: _partition: optional struct<> (Partition to which a row belongs to)
    }
    Provided schema:
    table {
      2147483643: _spec_id: optional int
      2147483642: _partition: optional struct<>
    }
    Problems:
    * _spec_id should be required, but is optional
        at org.apache.iceberg.types.TypeUtil.checkSchemaCompatibility(TypeUtil.java:493){quote}
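
To make the mismatch concrete, here is a minimal Python sketch of the kind of nullability check that rejects the provided schema. It is a toy model only, not Iceberg's TypeUtil.checkSchemaCompatibility and not Spark code: the Field class, the check_nullability function, and the "missing" message are hypothetical names introduced for illustration. The field IDs, names, and required/optional flags are copied from the failure message above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; not an Iceberg or Spark class."""
    field_id: int
    name: str
    type_name: str
    required: bool


def check_nullability(expected, provided):
    """Return one problem string for each field that the expected schema marks
    as required but the provided schema marks as optional (or omits).
    Fields are matched by ID. This is an assumed, simplified rule."""
    provided_by_id = {f.field_id: f for f in provided}
    problems = []
    for exp in expected:
        got = provided_by_id.get(exp.field_id)
        if got is None:
            if exp.required:
                # Hypothetical message for a case not shown in the report.
                problems.append(f"{exp.name} is required, but is missing")
            continue
        if exp.required and not got.required:
            problems.append(f"{exp.name} should be required, but is optional")
    return problems


# Expected metadata schema, as printed in the failure message.
expected = [
    Field(2147483643, "_spec_id", "int", required=True),
    Field(2147483642, "_partition", "struct<>", required=False),
]

# Schema actually provided, as printed in the failure message.
provided = [
    Field(2147483643, "_spec_id", "int", required=False),
    Field(2147483642, "_partition", "struct<>", required=False),
]

problems = check_nullability(expected, provided)
for problem in problems:
    print("*", problem)
```

In this sketch only _spec_id is reported, because _partition is optional in both schemas. That matches the single entry under "Problems:" in the failure.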



