Option 2 seems fine to me.

On Tue, Mar 17, 2020 at 3:41 PM, Wenchen Fan <cloud0...@gmail.com> wrote:

> I don't think option 1 is possible.
>
> For option 2: I think we need to do it anyway. It's kind of a bug that the
> typed Scala UDF doesn't support case classes and thus can't support
> struct-type input columns.
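>
> A minimal sketch of that use case (a hypothetical example, not from this
> thread; assumes a SparkSession with spark.implicits._ in scope):
>
>   import org.apache.spark.sql.functions.{struct, udf}
>
>   case class Person(name: String, age: Int)
>
>   // Users expect the typed API to accept a struct-type column here:
>   val describe = udf((p: Person) => s"${p.name} (${p.age})")
>   // df.select(describe(struct($"name", $"age")))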
>
> For option 3: It's a bit risky to add a new API, but it seems like we have
> a good reason. The untyped Scala UDF supports Row as input/output, which is
> a valid use case to me. It requires a "returnType" parameter, but no input
> types. This brings 2 problems: 1) if a UDF parameter is primitive-type but
> the actual value is null, the result will be wrong; 2) the analyzer can't
> type-check the UDF's inputs.
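>
> As a sketch of problem 1 (using the existing two-arg untyped variant,
> udf(f: AnyRef, dataType: DataType), in org.apache.spark.sql.functions):
>
>   import org.apache.spark.sql.functions.udf
>   import org.apache.spark.sql.types.IntegerType
>
>   // Only the return type is declared; the input types are unknown.
>   val plusOne = udf((i: Int) => i + 1, IntegerType)
>   // For a null input, the primitive parameter i is filled with the
>   // default value 0, so the UDF silently returns 1 instead of null.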
>
> Maybe we can add a new method, def udf(f: AnyRef, inputTypes:
> Seq[(DataType, Boolean)], returnType: DataType), to allow users to
> specify the expected input data types and nullabilities.
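>
> If we added it, usage might look like this (hypothetical sketch; the
> method does not exist yet, and the shape follows the proposal above):
>
>   import org.apache.spark.sql.types.IntegerType
>
>   // One (DataType, nullable) pair per UDF parameter, so the analyzer
>   // can type-check inputs and handle nulls for primitive parameters.
>   val plusOne = udf(
>     (i: Int) => i + 1,
>     Seq((IntegerType, false)),  // input types and nullabilities
>     IntegerType)                // return type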
>
> On Tue, Mar 17, 2020 at 1:58 PM wuyi <yi...@databricks.com> wrote:
>
>> Thanks Sean and Takeshi.
>>
>> Option 1 indeed seems impossible, so I'm going to take Option 2 as the
>> alternative.
