[ https://issues.apache.org/jira/browse/SPARK-47946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Junyoung Cho updated SPARK-47946:
---------------------------------
    Description:
I got an error when appending to a table using DataFrameWriterV2.

The error occurred in TableOutputResolver.checkNullability. It occurs when the data types in the schema are the same but the order of the fields is different.

I found that GetStructField.nullable returns an unexpected result.
{code:java}
override def nullable: Boolean = child.nullable || childSchema(ordinal).nullable {code}
Even if the nested field is not nullable, it returns true when the parent struct is nullable.
||Parent nullability||Child nullability||Result||
|true|true|true|
|{color:#ff0000}true{color}|{color:#ff0000}false{color}|{color:#ff0000}true{color}|
|{color:#172b4d}false{color}|{color:#172b4d}true{color}|{color:#172b4d}true{color}|
|false|false|false|

I think the logic should be changed to use just the child's nullability, because both the parent and the child should be nullable for the result to be considered nullable.
{code:java}
override def nullable: Boolean = childSchema(ordinal).nullable {code}

I want to check whether the current logic is reasonable, or whether my suggestion could cause other side effects.

  was:
I got an error when appending to a table using DataFrameWriterV2.

The error occurred in TableOutputResolver.checkNullability. It occurs when the data types in the schema are the same but the order of the fields is different.

I found that GetStructField.nullable returns an unexpected result.
{code:java}
override def nullable: Boolean = child.nullable || childSchema(ordinal).nullable {code}
Even if the nested field is not nullable, it returns true when the parent struct is nullable.
||Parent nullability||Child nullability||Result||
|true|true|true|
|{color:#ff0000}true{color}|{color:#ff0000}false{color}|{color:#ff0000}true{color}|
|{color:#ff0000}false{color}|{color:#ff0000}true{color}|{color:#ff0000}true{color}|
|false|false|false|

I think the logic should be changed to an AND operation, because both the parent and the child should be nullable for the result to be considered nullable.
{code:java}
override def nullable: Boolean = child.nullable || childSchema(ordinal).nullable {code}

I want to check whether the current logic is reasonable, or whether my suggestion could cause other side effects.

> Nested field's nullable value could be invalid after extracted using GetStructField
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-47946
>                 URL: https://issues.apache.org/jira/browse/SPARK-47946
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 3.4.2
>            Reporter: Junyoung Cho
>            Priority: Major
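
The truth table in the description can be reproduced with a small stand-alone model. This is only a sketch: it does not use Spark, and the names NullabilityTable, currentRule, and proposedRule are hypothetical, introduced here for illustration. currentRule mirrors the expression quoted from GetStructField.nullable, and proposedRule mirrors the change suggested in the description.

```scala
// Stand-alone model of the two nullability rules discussed in the issue.
// Hypothetical names; this is not Spark code.
object NullabilityTable {
  // Rule quoted in the issue: child.nullable || childSchema(ordinal).nullable
  def currentRule(parentNullable: Boolean, fieldNullable: Boolean): Boolean =
    parentNullable || fieldNullable

  // Rule suggested in the issue: childSchema(ordinal).nullable
  def proposedRule(parentNullable: Boolean, fieldNullable: Boolean): Boolean =
    fieldNullable

  // All four (parent, child) combinations, in the same order as the table.
  val cases: Seq[(Boolean, Boolean)] =
    for (p <- Seq(true, false); f <- Seq(true, false)) yield (p, f)

  // One row per combination: (parent, child, current result, proposed result).
  def rows: Seq[(Boolean, Boolean, Boolean, Boolean)] =
    cases.map { case (p, f) => (p, f, currentRule(p, f), proposedRule(p, f)) }

  def main(args: Array[String]): Unit = {
    println("parent\tchild\tcurrent\tproposed")
    rows.foreach { case (p, f, c, n) => println(s"$p\t$f\t$c\t$n") }
  }
}
```

The two rules disagree only when the parent is nullable and the child is not, which is the case the description asks about.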