Shardul Mahadik created SPARK-37822:
---------------------------------------
Summary: SQL function `split` should return an array of non-nullable elements
Key: SPARK-37822
URL: https://issues.apache.org/jira/browse/SPARK-37822
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.2.0
Reporter: Shardul Mahadik

Currently, {{split}} [returns the data type|https://github.com/apache/spark/blob/08dd010860cc176a33073928f4c0780d0ee98a08/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L532] {{ArrayType(StringType)}}, which marks the elements of the resulting array as nullable. However, I do not see any case where the array can actually contain nulls:

* If either the provided string or the delimiter is NULL, the output is a NULL array.
* If the string is empty, or there are no characters between delimiters, the output array contains empty strings but never NULLs.

So I propose we change the return type of {{split}} to mark the elements as non-null.
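
To make the cases above concrete, here is a minimal stand-alone sketch in plain Java. It is only a model of the behaviour described in this ticket, not Spark's implementation: the class and method names ({{SplitNullabilitySketch}}, {{splitOrNull}}, {{hasNoNullElements}}) are hypothetical, and it assumes regex splitting with a negative limit so that trailing empty strings are kept.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical model of the semantics described in the ticket.
// This is NOT Spark code; it assumes regex splitting with a negative
// limit, so trailing empty strings are preserved.
class SplitNullabilitySketch {

    // A null string or null delimiter yields a null array,
    // mirroring "NULL input gives a NULL array".
    static String[] splitOrNull(String str, String regex) {
        if (str == null || regex == null) {
            return null;
        }
        return str.split(regex, -1);
    }

    // True when the array itself is non-null and none of its elements is null.
    static boolean hasNoNullElements(String[] parts) {
        return parts != null && Arrays.stream(parts).noneMatch(Objects::isNull);
    }

    public static void main(String[] args) {
        // Ordinary input, adjacent delimiters, delimiter only, empty string.
        String[] inputs = {"a,b", "a,,b", ",", ""};
        for (String in : inputs) {
            String[] parts = splitOrNull(in, ",");
            System.out.println("\"" + in + "\" -> " + Arrays.toString(parts)
                + " noNullElements=" + hasNoNullElements(parts));
        }
        // A null input produces a null array rather than an array holding nulls.
        System.out.println("null input -> " + Arrays.toString(splitOrNull(null, ",")));
    }
}
```

In this model the only way a null appears is as the whole result, never as an element, which is the property the proposed return type would express.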