Here is a sample UDF:

import org.apache.spark.sql.functions.udf
val getlength = udf((s: String) => s.length)
data.select(getlength(data("col1")))  // "data" is the DataFrame
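
For the multi-parameter case in your mail: a UDF is applied to Column arguments, so plain Scala values such as true or 2 need to be wrapped with lit(...) from org.apache.spark.sql.functions. Writing $"true" does not work either, because that is resolved as a column named true, which is what the AnalysisException is reporting. Below is a minimal sketch, not your actual code: the whitespace split is only a stand-in for HanLP.segment, the sample rows are made up, and the SparkSession setup assumes a local run (in spark-shell, getOrCreate simply returns the existing session).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, udf}

// Assumption: local session for illustration; spark-shell already provides one.
val spark = SparkSession.builder().master("local[*]").appName("multi-param-udf").getOrCreate()
import spark.implicits._

// Stand-in for ssplit2: whitespace splitting replaces HanLP.segment (assumption).
val ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean, minTermLen: Int) =>
  sentence.split("\\s+").toSeq
    .filter(t => !(delNum && t.forall(_.isDigit)))                  // drop all-digit terms
    .filter(t => !(delEn && t.forall(c => c.isLetter && c < 128)))  // drop ASCII-letter terms
    .filter(_.length >= minTermLen)                                 // drop short terms
}

// Made-up sample data with the same columns as in the error message: id, text.
val input = Seq((1, "spark 2017 中文 a"), (2, "udf 42 参数")).toDF("id", "text")

// Each constant is wrapped in lit(...) so that it is passed as a Column.
val output = input.select(ssplit2($"text", lit(true), lit(true), lit(2)).as("words"))
output.show(false)
```

If the flags never change, another option is to close over them as ordinary Scala values and let the UDF take only the text column.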

On Fri, Jun 16, 2017 at 9:21 AM, lk_spark <lk_sp...@163.com> wrote:

> Hi all,
>      I defined a UDF with multiple parameters, but I don't know how to
> call it on a DataFrame.
>
> UDF:
>
> def ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean,
> minTermLen: Int) =>
>     val terms = HanLP.segment(sentence).asScala
> .....
>
> Call :
>
> scala> val output = input.select(ssplit2($"text",true,true,2).as('words))
> <console>:40: error: type mismatch;
>  found   : Boolean(true)
>  required: org.apache.spark.sql.Column
>        val output = input.select(ssplit2($"text",true,true,2).as('words))
>                                                  ^
> <console>:40: error: type mismatch;
>  found   : Boolean(true)
>  required: org.apache.spark.sql.Column
>        val output = input.select(ssplit2($"text",true,true,2).as('words))
>                                                       ^
> <console>:40: error: type mismatch;
>  found   : Int(2)
>  required: org.apache.spark.sql.Column
>        val output = input.select(ssplit2($"text",true,true,2).as('words))
>                                                            ^
>
> scala> val output = input.select(ssplit2($"text",$"true",$"true",$"2").as('words))
> org.apache.spark.sql.AnalysisException: cannot resolve '`true`' given
> input columns: [id, text];;
> 'Project [UDF(text#6, 'true, 'true, '2) AS words#16]
> +- Project [_1#2 AS id#5, _2#3 AS text#6]
>    +- LocalRelation [_1#2, _2#3]
>
> Any help would be appreciated.
>
>
> 2017-06-16
> ------------------------------
> lk_spark
>
