Huaxin Gao created SPARK-51479:
----------------------------------

Summary: Nullable in Row Level Operation Column is not correct
Key: SPARK-51479
URL: https://issues.apache.org/jira/browse/SPARK-51479
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.0.0
Reporter: Huaxin Gao

I got a few test failures in the Iceberg / Spark 4.0 integration. I think the root cause is that the nullability of the Row Level Operation column is not correct: in the failure below, the _spec_id metadata column is required in the expected schema but arrives as optional in the provided schema. Here is the test failure:

{quote}
TestMergeOnReadUpdate > testUpdateWithMultiColumnInSubquery() > catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config = {type=hive, default-namespace=default, clients=1, parquet-enabled=false, cache-enabled=false}, format = AVRO, vectorized = false, distributionMode = range, fanout = false, branch = test, planningMode = DISTRIBUTED, formatVersion = 3 FAILED
    java.lang.IllegalArgumentException: Provided metadata schema is incompatible with expected schema:

    table {
      2147483643: _spec_id: required int (Spec ID used to track the file containing a row)
      2147483642: _partition: optional struct<> (Partition to which a row belongs to)
    }

    Provided schema:

    table {
      2147483643: _spec_id: optional int
      2147483642: _partition: optional struct<>
    }

    Problems:

    * _spec_id should be required, but is optional
        at org.apache.iceberg.types.TypeUtil.checkSchemaCompatibility(TypeUtil.java:493)
{quote}
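
To illustrate the kind of check that rejects the schema above, here is a minimal sketch in Python. It is not Iceberg's implementation of TypeUtil.checkSchemaCompatibility. The Field class and the check_compatibility helper are hypothetical names introduced only for this example, and the sketch assumes that fields are matched by ID and that a field required by the expected schema must not be optional in the provided one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical, simplified schema field (not an Iceberg class)."""
    field_id: int
    name: str
    type_name: str
    required: bool


def check_compatibility(expected, provided):
    """Return a list of problem strings; an empty list means compatible.

    Assumption: fields are matched by ID, and a field that the expected
    schema marks as required must also be required in the provided schema.
    """
    provided_by_id = {f.field_id: f for f in provided}
    problems = []
    for exp in expected:
        got = provided_by_id.get(exp.field_id)
        if got is None:
            # A missing field is only a problem when it is required.
            if exp.required:
                problems.append(f"{exp.name} is required, but is missing")
            continue
        if exp.required and not got.required:
            problems.append(f"{exp.name} should be required, but is optional")
    return problems


# Schemas copied from the failure message in this report.
expected = [
    Field(2147483643, "_spec_id", "int", required=True),
    Field(2147483642, "_partition", "struct<>", required=False),
]
provided = [
    Field(2147483643, "_spec_id", "int", required=False),
    Field(2147483642, "_partition", "struct<>", required=False),
]

problems = check_compatibility(expected, provided)
print(problems)
```

Under these assumptions the only reported problem is the one for _spec_id, which matches the report: the column reaches the check marked as nullable even though the expected metadata schema declares it required.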