In Scala, import the DataFrame functions and call your registered UDF by name with callUDF:

import org.apache.spark.sql.functions.{callUDF, col}

dataframe.select(callUDF("MyUDF", col("col1"), col("col2")))
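Since your registration code is Java, here is a rough end-to-end sketch in Java as well. This assumes an existing HiveContext named hiveContext and a DataFrame named dataframe with columns col1/col2 (those names, and the toUpperCase body, are just placeholders for illustration):

// Register the UDF once per context. Note the generic type parameters
// on UDF1 and DataTypes.StringType (not DataTypes.String).
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

hiveContext.udf().register("MyUDF", new UDF1<String, String>() {
    @Override
    public String call(String s) throws Exception {
        // placeholder logic -- substitute your real UDF body
        return s == null ? null : s.toUpperCase();
    }
}, DataTypes.StringType);

// Then call it by its registered name from the DataFrame API,
// including inside a groupBy/count like the one in your query.
DataFrame result = dataframe
        .select(callUDF("MyUDF", col("col1")).as("col1_udf"), col("col2"))
        .groupBy(col("col1_udf"), col("col2"))
        .count();

The key point is that you never call MyUDF directly as a method: callUDF("MyUDF", ...) looks it up by the name you registered it under and returns a Column you can use anywhere a Column is accepted.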
On Fri, Oct 2, 2015 at 6:25 AM, unk1102 <umesh.ka...@gmail.com> wrote:
> Hi, I have registered my Hive UDF using the following code:
>
> hiveContext.udf().register("MyUDF", new UDF1<String, String>() {
>     public String call(String o) throws Exception {
>         // bla bla
>     }
> }, DataTypes.StringType);
>
> Now I want to use MyUDF in a DataFrame. How do we use it? I know how to
> use it in SQL, and that works fine:
>
> hiveContext.sql("select MyUDF('test') from myTable");
>
> My hiveContext.sql() query involves a group by on multiple columns, so
> for scaling purposes I am trying to convert this query to the DataFrame
> API:
>
> dataframe.select("col1", "col2", "coln").groupBy("col1", "col2", "coln").count();
>
> Can we do the following: dataframe.select(MyUDF("col1"))? Please guide.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-registered-Hive-UDF-in-Spark-DataFrame-tp24907.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org