[ 
https://issues.apache.org/jira/browse/SPARK-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527556#comment-16527556
 ] 

Marek Novotny commented on SPARK-24165:
---------------------------------------

It seems that Spark is not able to resolve nullability for nested types correctly.

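{{import java.util}}
{{import org.apache.spark.sql.Row}}
{{import org.apache.spark.sql.functions._}}
{{import org.apache.spark.sql.types._}}
{{import spark.implicits._}}

{{val rows = new util.ArrayList[Row]()}}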
{{rows.add(Row(true, ("1", 1)))}}
{{rows.add(Row(false, (null, 2)))}}
{{val schema = StructType(Seq(}}
{{ StructField("cond", BooleanType, false),}}
{{ StructField("s", StructType(Seq(}}
{{ StructField("val1", StringType, true),}}
{{ StructField("val2", IntegerType, false)}}
{{ )))}}
{{))}}

{{val df = spark.createDataFrame(rows, schema)}}

{{df.select(when('cond, expr("struct('x' as val1, 10 as val2)")).otherwise('s) as "result").printSchema()}}

Result (val1 is reported as non-nullable, although the otherwise branch can return null for it, so nullable = true is expected):

{{root}}
{{ |-- result: struct (nullable = true)}}
{{ | |-- val1: string (nullable = false)}}
{{ | |-- val2: integer (nullable = false)}}
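
To make the expected behavior concrete, here is a minimal sketch of how the nullability of the two branch types could be combined. The object NullabilityMerge and its merge function are hypothetical names used only for illustration; this is not Spark's own resolution code, and it assumes that the struct('x' as val1, 10 as val2) branch has non-nullable fields.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper for illustration; this is not Spark's own resolution code.
object NullabilityMerge {

  // Combine two types of the same shape, marking a nested field (or array
  // element, or map value) as nullable if it is nullable on either side.
  def merge(left: DataType, right: DataType): DataType = (left, right) match {
    case (StructType(lf), StructType(rf)) if lf.length == rf.length =>
      StructType(lf.zip(rf).map { case (l, r) =>
        l.copy(
          dataType = merge(l.dataType, r.dataType),
          nullable = l.nullable || r.nullable)
      })
    case (ArrayType(le, ln), ArrayType(re, rn)) =>
      ArrayType(merge(le, re), ln || rn)
    case (MapType(lk, lv, ln), MapType(rk, rv, rn)) =>
      MapType(merge(lk, rk), merge(lv, rv), ln || rn)
    case _ => left // leaf types, or shapes that do not match: keep the left side
  }

  // Types of the two branches from the example above (assumed, see lead-in).
  val whenBranch: StructType = StructType(Seq(
    StructField("val1", StringType, nullable = false),
    StructField("val2", IntegerType, nullable = false)))
  val otherwiseBranch: StructType = StructType(Seq(
    StructField("val1", StringType, nullable = true),
    StructField("val2", IntegerType, nullable = false)))

  val merged: StructType =
    merge(whenBranch, otherwiseBranch).asInstanceOf[StructType]
}
```

With this sketch, NullabilityMerge.merged("val1").nullable is true, which is what printSchema() would be expected to report for val1.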

 

I will take a look at the problem.

 

> UDF within when().otherwise() raises NullPointerException
> ---------------------------------------------------------
>
>                 Key: SPARK-24165
>                 URL: https://issues.apache.org/jira/browse/SPARK-24165
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Jingxuan Wang
>            Priority: Major
>
> I have a UDF which takes java.sql.Timestamp and String as input column types 
> and returns an Array of (Seq[case class], Double) as output. Since some of the 
> values in the input columns can be null, I put the UDF inside a 
> when($input.isNull, null).otherwise(UDF) expression. This works well 
> when I test it in spark-shell. But when run as a Scala jar via spark-submit in 
> YARN cluster mode, it raises a NullPointerException that points to the UDF 
> function. If I remove the when().otherwise() condition and instead put the null 
> check inside the UDF, the function works without issue in spark-submit.
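
For reference, the workaround described in the issue (null check inside the UDF rather than in when().otherwise()) might look roughly like the sketch below. The case class Item, the object NullSafeUdf, and the body of scoreOrNull are hypothetical placeholders, since the issue does not show the real UDF; only the input and output types follow the description.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.functions.udf

// Hypothetical payload type; the issue does not show the real case class.
case class Item(label: String)

object NullSafeUdf {

  // The null check lives inside the function, so the call site does not
  // need a when($input.isNull, null).otherwise(...) wrapper.
  def scoreOrNull(ts: Timestamp, s: String): Seq[(Seq[Item], Double)] =
    if (ts == null || s == null) null
    else Seq((Seq(Item(s)), ts.getTime.toDouble)) // placeholder logic

  // Wrap the plain function as a Spark UDF; a null result becomes a null cell.
  val scoreUdf = udf(scoreOrNull _)
}
```

The UDF can then be applied directly, e.g. df.select(NullSafeUdf.scoreUdf($"ts", $"s")), where the column names ts and s are assumptions.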



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
