I don't think option 1 is possible.

For option 2: I think we need to do it anyway. It's kind of a bug that the
typed Scala UDF doesn't support case classes and thus can't accept
struct-type input columns.
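
To illustrate the limitation (the case class, column names, and local
session below are made up for the example):

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions.{struct, udf}

  case class Point(x: Double, y: Double)

  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  val df = Seq((1.0, 2.0)).toDF("x", "y").select(struct($"x", $"y").as("p"))

  // The UDF is declared over Point, but at runtime the struct column is
  // passed in as a Row, so this fails with a ClassCastException instead
  // of deserializing the struct into the case class.
  val dist = udf((p: Point) => math.sqrt(p.x * p.x + p.y * p.y))
  df.select(dist($"p")).show()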

For option 3: It's a bit risky to add a new API, but it seems we have a
good reason. The untyped Scala UDF supports Row as input/output, which is a
valid use case to me. It requires a "returnType" parameter but no input
types, and that brings 2 problems: 1) if a UDF parameter is of primitive
type but the actual value is null, the result will be wrong; 2) the
analyzer can't do type checking for the UDF's inputs. The snippet below
shows problem 1.
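
For example (a minimal sketch using the existing untyped
functions.udf(f: AnyRef, dataType: DataType) variant and a toy column):

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions.udf
  import org.apache.spark.sql.types.IntegerType

  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  // Untyped UDF: only the return type is declared; the Int parameter
  // type is invisible to the analyzer.
  val plusOne = udf(((i: Int) => i + 1): AnyRef, IntegerType)

  // For the null row, the null is unboxed to 0, so the UDF silently
  // returns 1 instead of null -- the wrong-result problem above.
  Seq[Integer](1, null).toDF("i").select(plusOne($"i")).show()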

Maybe we can add a new method, def udf(f: AnyRef, inputTypes: Seq[(DataType,
Boolean)], returnType: DataType), to allow users to specify the expected
input data types and nullabilities.
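
Sketched out (this method doesn't exist in Spark yet, so the body is a
stub and the usage is hypothetical):

  import org.apache.spark.sql.expressions.UserDefinedFunction
  import org.apache.spark.sql.types.{DataType, IntegerType, StringType}

  // Proposed signature only; a real implementation would thread the
  // nullability flags into the analyzer's null checks and type checks.
  def udf(f: AnyRef,
          inputTypes: Seq[(DataType, Boolean)],
          returnType: DataType): UserDefinedFunction = ???

  // Hypothetical usage: the Int parameter is declared non-nullable, so
  // the analyzer could add a null check; the String parameter allows null.
  val label = udf(
    ((i: Int, s: String) => s"$s:$i"): AnyRef,
    Seq((IntegerType, false), (StringType, true)),
    StringType)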

On Tue, Mar 17, 2020 at 1:58 PM wuyi <yi...@databricks.com> wrote:

> Thanks Sean and Takeshi.
>
> Option 1 does seem impossible, so I'm going to take Option 2 as the
> alternative choice.
>
