import org.apache.spark.sql.functions.{udf, lit}

val getlength = udf((idx1: Int, idx2: Int, data: String) => data.substring(idx1, idx2))
data.select(getlength(lit(1), lit(2), data("col1"))).collect()
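For the ssplit2 case quoted below the pattern is the same: wrap every non-column argument in lit() so it becomes a constant Column. A self-contained sketch follows; note the whitespace tokenizer is only a stand-in for the HanLP segmentation elided in your mail, and the sample input DataFrame is made up for illustration:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{udf, lit}

val spark = SparkSession.builder().appName("udf-lit-example").master("local[*]").getOrCreate()
import spark.implicits._

// Stand-in for the HanLP-based body from the original mail: split on
// whitespace, optionally drop purely numeric or purely English tokens,
// and keep only terms of at least minTermLen characters.
val ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean, minTermLen: Int) =>
  sentence.split("\\s+").toSeq
    .filterNot(t => delNum && t.forall(_.isDigit))
    .filterNot(t => delEn && t.matches("[A-Za-z]+"))
    .filter(_.length >= minTermLen)
}

// Hypothetical sample data standing in for the real input DataFrame.
val input = Seq((1, "spark2 42 udf_demo ok")).toDF("id", "text")

// Only $"text" is a real column; the other three arguments become
// constant Columns via lit(), which is what the compiler was asking for.
val output = input.select(ssplit2($"text", lit(true), lit(true), lit(2)).as("words"))
output.show(false)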
On Fri, Jun 16, 2017 at 10:22 AM, Pralabh Kumar <pralabhku...@gmail.com> wrote:

> Use lit. Give me some time and I'll provide an example.
>
> On 16-Jun-2017 10:15 AM, "lk_spark" <lk_sp...@163.com> wrote:
>
>> Thanks Kumar. I want to know how to call a udf with multiple parameters,
>> for example a udf that acts like a substr function: how can I pass the
>> begin and end index as parameters? I tried it and got errors. Can udf
>> parameters only be of Column type?
>>
>> 2017-06-16
>> ------------------------------
>> lk_spark
>> ------------------------------
>>
>> *From:* Pralabh Kumar <pralabhku...@gmail.com>
>> *Sent:* 2017-06-16 17:49
>> *Subject:* Re: how to call udf with parameters
>> *To:* "lk_spark" <lk_sp...@163.com>
>> *Cc:* "user.spark" <user@spark.apache.org>
>>
>> Sample UDF:
>> val getlength = udf((data: String) => data.length())
>> data.select(getlength(data("col1")))
>>
>> On Fri, Jun 16, 2017 at 9:21 AM, lk_spark <lk_sp...@163.com> wrote:
>>
>>> hi, all
>>> I defined a udf with multiple parameters, but I don't know how to
>>> call it with a DataFrame.
>>>
>>> UDF:
>>>
>>> def ssplit2 = udf { (sentence: String, delNum: Boolean, delEn: Boolean,
>>> minTermLen: Int) =>
>>>   val terms = HanLP.segment(sentence).asScala
>>>   .....
>>>
>>> Call:
>>>
>>> scala> val output = input.select(ssplit2($"text", true, true, 2).as('words))
>>> <console>:40: error: type mismatch;
>>>  found   : Boolean(true)
>>>  required: org.apache.spark.sql.Column
>>>        val output = input.select(ssplit2($"text", true, true, 2).as('words))
>>>                                                   ^
>>> <console>:40: error: type mismatch;
>>>  found   : Boolean(true)
>>>  required: org.apache.spark.sql.Column
>>>        val output = input.select(ssplit2($"text", true, true, 2).as('words))
>>>                                                         ^
>>> <console>:40: error: type mismatch;
>>>  found   : Int(2)
>>>  required: org.apache.spark.sql.Column
>>>        val output = input.select(ssplit2($"text", true, true, 2).as('words))
>>>                                                               ^
>>>
>>> scala> val output = input.select(ssplit2($"text", $"true", $"true", $"2").as('words))
>>> org.apache.spark.sql.AnalysisException: cannot resolve '`true`' given
>>> input columns: [id, text];;
>>> 'Project [UDF(text#6, 'true, 'true, '2) AS words#16]
>>> +- Project [_1#2 AS id#5, _2#3 AS text#6]
>>>    +- LocalRelation [_1#2, _2#3]
>>>
>>> I need help!!
>>>
>>> 2017-06-16
>>> ------------------------------
>>> lk_spark