Sample UDF:

    val getlength = udf((data: String) => data.length())
    data.select(getlength(data("col1")))

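For the multi-parameter case in the question, every argument passed to a UDF has to be a Column, so constant values need to be wrapped with lit() from org.apache.spark.sql.functions. Below is a minimal sketch, not the original code: the HanLP-based body is elided in the message, so a whitespace tokenizer stands in for it, and the object name MultiParamUdfSketch, the helper keepTerm, and the sample rows are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, udf}

object MultiParamUdfSketch {
  // Hypothetical helper: decides whether a term survives the filters.
  // It stands in for whatever the real HanLP-based code does with these flags.
  def keepTerm(term: String, delNum: Boolean, delEn: Boolean, minTermLen: Int): Boolean = {
    val isNumber  = term.nonEmpty && term.forall(_.isDigit)
    val isEnglish = term.nonEmpty && term.forall(c => c.isLetter && c < 128)
    !(delNum && isNumber) && !(delEn && isEnglish) && term.length >= minTermLen
  }

  // Same parameter list as ssplit2 in the question; the body is an assumption.
  val ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean, minTermLen: Int) =>
    sentence.split("\\s+").toSeq.filter(t => keepTerm(t, delNum, delEn, minTermLen))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("multi-param-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with the same column names as in the question.
    val input = Seq((1, "Spark 2 大数据 处理"), (2, "udf 2017 多个 参数")).toDF("id", "text")

    // lit(...) turns each constant into a Column, which is what the UDF call expects.
    val output = input.select(ssplit2($"text", lit(true), lit(true), lit(2)).as("words"))
    output.show(false)

    spark.stop()
  }
}
```

If the flags never change from row to row, another option is to close over them: write a plain function that takes delNum, delEn and minTermLen and returns a one-argument UDF over the text column, so that only $"text" is passed at the call site.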
On Fri, Jun 16, 2017 at 9:21 AM, lk_spark <lk_sp...@163.com> wrote:

> hi, all
> I define a udf with multiple parameters, but I don't know how to
> call it with DataFrame
>
> UDF:
>
> def ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean,
> minTermLen: Int) =>
>   val terms = HanLP.segment(sentence).asScala
> .....
>
> Call :
>
> scala> val output = input.select(ssplit2($"text",true,true,2).as('words))
> <console>:40: error: type mismatch;
>  found   : Boolean(true)
>  required: org.apache.spark.sql.Column
>        val output = input.select(ssplit2($"text",true,true,2).as('words))
>                                                  ^
> <console>:40: error: type mismatch;
>  found   : Boolean(true)
>  required: org.apache.spark.sql.Column
>        val output = input.select(ssplit2($"text",true,true,2).as('words))
>                                                       ^
> <console>:40: error: type mismatch;
>  found   : Int(2)
>  required: org.apache.spark.sql.Column
>        val output = input.select(ssplit2($"text",true,true,2).as('words))
>                                                            ^
>
> scala> val output = input.select(ssplit2($"text",$"true",$"true",$"2").as('words))
> org.apache.spark.sql.AnalysisException: cannot resolve '`true`' given
> input columns: [id, text];;
> 'Project [UDF(text#6, 'true, 'true, '2) AS words#16]
> +- Project [_1#2 AS id#5, _2#3 AS text#6]
>    +- LocalRelation [_1#2, _2#3]
>
> I need help!!
>
>
> 2017-06-16
> ------------------------------
> lk_spark
>