[ 
https://issues.apache.org/jira/browse/SPARK-47946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junyoung Cho updated SPARK-47946:
---------------------------------
    Description: 
I got an error when appending to a table using DataFrameWriterV2.

The error occurred in TableOutputResolver.checkNullability. It occurs when 
the data types in the schema are the same but the order of the fields is 
different.

I found that GetStructField.nullable returns an unexpected result.
{code:java}
override def nullable: Boolean = child.nullable || childSchema(ordinal).nullable
{code}
Even if the nested field is not nullable, it returns true when the parent 
struct is nullable.
||Parent nullability||Child nullability||Result||
|true|true|true|
|{color:#ff0000}true{color}|{color:#ff0000}false{color}|{color:#ff0000}true{color}|
|false|true|true|
|false|false|false|
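
The table above can be reproduced with a small, Spark-free sketch. The names `currentRule` and `childOnlyRule` are hypothetical helpers introduced only for illustration. They model the two boolean expressions discussed in this report and are not Spark APIs.

```scala
// Spark-free model of the two nullability rules discussed in this report.
// `currentRule` and `childOnlyRule` are hypothetical names, not Spark APIs.
object NullabilityTable {
  // Mirrors: child.nullable || childSchema(ordinal).nullable
  def currentRule(parentNullable: Boolean, fieldNullable: Boolean): Boolean =
    parentNullable || fieldNullable

  // Mirrors the proposed change: childSchema(ordinal).nullable
  def childOnlyRule(parentNullable: Boolean, fieldNullable: Boolean): Boolean =
    fieldNullable

  def main(args: Array[String]): Unit = {
    println("parent\tfield\tcurrent\tchildOnly")
    // Enumerate all four combinations of parent and field nullability.
    for {
      parent <- Seq(true, false)
      field  <- Seq(true, false)
    } println(s"$parent\t$field\t${currentRule(parent, field)}\t${childOnlyRule(parent, field)}")
  }
}
```

The two rules disagree only when the parent is nullable and the field is not, which is the row highlighted in red.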

 

I think the logic should be changed to use only the child field's nullability, 
because the extracted field should not be considered nullable merely because 
its parent struct is nullable.

{code:java}
override def nullable: Boolean = childSchema(ordinal).nullable
{code}

I would like to check whether the current logic is reasonable, or whether my 
suggestion could cause other side effects.
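
One way to probe the side-effect question without Spark is to model a nullable struct as an `Option`. This is only a sketch under that assumption: `Address`, `Person`, and `getCity` are hypothetical names, and real `GetStructField` evaluation is not shown here.

```scala
// Hypothetical model: a nullable struct column is an Option, and its
// non-nullable field is a plain String inside the case class.
final case class Address(city: String)            // field declared non-nullable
final case class Person(address: Option[Address]) // parent struct is nullable

object NullStructExtraction {
  // Models extracting `address.city`; None stands in for SQL NULL.
  def getCity(p: Person): Option[String] = p.address.map(_.city)

  def main(args: Array[String]): Unit = {
    println(getCity(Person(Some(Address("Seoul"))))) // parent present
    println(getCity(Person(None)))                   // parent absent
  }
}
```

In this model the extracted value is absent whenever the parent is absent, even though the field itself is declared non-nullable. Whether Spark relies on the same behavior is part of what I would like to confirm.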

  was:
I got an error when appending to a table using DataFrameWriterV2.

The error occurred in TableOutputResolver.checkNullability. It occurs when 
the data types in the schema are the same but the order of the fields is 
different.

I found that GetStructField.nullable returns an unexpected result.
{code:java}
override def nullable: Boolean = child.nullable || childSchema(ordinal).nullable
{code}
Even if the nested field is not nullable, it returns true when the parent 
struct is nullable.
||Parent nullability||Child nullability||Result||
|true|true|true|
|{color:#ff0000}true{color}|{color:#ff0000}false{color}|{color:#ff0000}true{color}|
|{color:#ff0000}false{color}|{color:#ff0000}true{color}|{color:#ff0000}true{color}|
|false|false|false|

 

I think the logic should be changed to an AND operation, because both the 
parent and the child should be nullable for the result to be considered nullable.

 
{code:java}
override def nullable: Boolean = child.nullable || childSchema(ordinal).nullable
{code}

I would like to check whether the current logic is reasonable, or whether my 
suggestion could cause other side effects.


> Nested field's nullable value could be invalid after extracted using 
> GetStructField
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-47946
>                 URL: https://issues.apache.org/jira/browse/SPARK-47946
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 3.4.2
>            Reporter: Junyoung Cho
>            Priority: Major


