How to use registered Hive UDF in Spark DataFrame?

unk1102 Fri, 02 Oct 2015 04:26:19 -0700

Hi, I have registered my Hive UDF using the following code:

hiveContext.udf().register("MyUDF", new UDF1<String, String>() {
    public String call(String o) throws Exception {
        // bla bla
    }
}, DataTypes.StringType);


Now I want to use the above MyUDF in a DataFrame. How do we use it? I know how
to use it in SQL, and it works fine:

hiveContext.sql("select MyUDF('test') from myTable");

My hiveContext.sql() query involves a group by on multiple columns, so for
scalability I am trying to convert this query to the DataFrame API:

dataframe.select("col1","col2","coln").groupBy("col1","col2","coln").count();

Can we do the following: dataframe.select(MyUDF("col1"))? Please guide.
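
To make the question concrete, here is a minimal sketch of the kind of call I am after. It assumes Spark 1.5 or later, where functions.callUDF(String, Column...) can look up a UDF by its registered name; selectExpr is shown as an alternative. The class name UdfInDataFrame, the local master setting, the placeholder upper-casing UDF body, and a Hive table myTable with string columns col1 and col2 are all assumptions for illustration, not part of my actual job.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.hive.HiveContext;
import org.apache.spark.sql.types.DataTypes;

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

public class UdfInDataFrame {
    public static void main(String[] args) {
        // Assumption: local mode, for illustration only.
        SparkConf conf = new SparkConf()
                .setAppName("UdfInDataFrame")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        HiveContext hiveContext = new HiveContext(sc.sc());

        // Register the UDF under the name "MyUDF".
        // The body is a placeholder (upper-casing), not the real logic.
        hiveContext.udf().register("MyUDF", new UDF1<String, String>() {
            @Override
            public String call(String s) throws Exception {
                return s == null ? null : s.toUpperCase();
            }
        }, DataTypes.StringType);

        // Assumption: a Hive table "myTable" with columns col1 and col2 exists.
        DataFrame dataframe = hiveContext.table("myTable");

        // Option 1: call the registered UDF by name on a Column.
        DataFrame viaCallUdf = dataframe.select(callUDF("MyUDF", col("col1")));

        // Option 2: use a SQL expression string that names the UDF.
        DataFrame viaExpr = dataframe.selectExpr("MyUDF(col1)");

        // The UDF result can also serve as a grouping column.
        DataFrame counts = dataframe
                .groupBy(callUDF("MyUDF", col("col1")), col("col2"))
                .count();

        viaCallUdf.show();
        viaExpr.show();
        counts.show();

        sc.stop();
    }
}
```

The point of the sketch is that the registered name is passed as a string, since MyUDF is not a Java method and so dataframe.select(MyUDF("col1")) would not compile as written.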


