Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22732#discussion_r225693558

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ---
    @@ -81,11 +81,11 @@ case class UserDefinedFunction protected[sql] (
           f,
           dataType,
           exprs.map(_.expr),
    +      nullableTypes.map(_.map(!_)).getOrElse(exprs.map(_ => false)),
    --- End diff --

    It's new in the sense that Scala 2.12 forced us to change where we get type information for UDFs -- specifically, whether the type is allowed to be null. Before, we could infer it directly from the closure, but in 2.12 we can't. This change is supposed to just get that info from elsewhere: from TypeTags captured when the UDFs are declared. The change ought to be invisible to users and, if done correctly, should not result in a behavior change.

    This follow-up change is more about being conservative and lessening the chance of a mistake (see the test case Mary Ann found) internally, or of a mistake in a library that uses ScalaUDF directly. For that reason, yeah, I agree with your comment here: default to assuming null checks are needed.
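To illustrate the conservative default being discussed, here is a minimal standalone sketch (not Spark's actual code; the names `Expr` and `inputsNullSafe` are hypothetical stand-ins). The key expression is the one from the diff: when nullable-type info from TypeTags is absent, every input defaults to "not null-safe", i.e. a null check is needed.

```scala
// Sketch of the conservative nullability default. Assumption: a "null-safe"
// input is one whose declared type is NOT nullable (e.g. a primitive), so
// the nullability flags are inverted.
object NullableDefaultsSketch {
  // Hypothetical stand-in for a column expression.
  case class Expr(name: String)

  // nullableTypes is Some(flags) when TypeTags were captured at UDF
  // declaration, and None when they weren't (e.g. ScalaUDF built directly).
  def inputsNullSafe(exprs: Seq[Expr],
                     nullableTypes: Option[Seq[Boolean]]): Seq[Boolean] =
    // With no type info, emit false for every input: null checks needed.
    nullableTypes.map(_.map(!_)).getOrElse(exprs.map(_ => false))

  def main(args: Array[String]): Unit = {
    val exprs = Seq(Expr("a"), Expr("b"))
    // Type info present: a nullable type (true) means not null-safe (false).
    println(inputsNullSafe(exprs, Some(Seq(true, false)))) // List(false, true)
    // No type info: conservatively assume every input needs a null check.
    println(inputsNullSafe(exprs, None))                   // List(false, false)
  }
}
```

The `getOrElse` branch is the conservative choice the comment endorses: absent information, it is safer to add a redundant null check than to skip a needed one.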