Re: How to use registered Hive UDF in Spark DataFrame?

Umesh Kacha Fri, 02 Oct 2015 13:30:35 -0700

Hi Michael,

Thanks much. How do we give an alias to the resulting column? For example, when
using

hiveContext.sql("select MyUDF('test') as mytest from myTable");

how do we do the same with DataFrame callUDF?

callUDF("MyUDF", col("col1"))???
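
For reference, callUDF returns a Column, so one way to name the result is Column.as(...), which plays the role of "as mytest" in the SQL form. The following is only a minimal sketch, assuming the Spark 1.5-era Java API. The class name, the local master, the sample rows, and the upper-casing UDF body are hypothetical stand-ins, and a plain SQLContext is used in place of the hiveContext from this thread.

```java
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class AliasUdfColumn {

    // Builds a one-column DataFrame, applies MyUDF with an alias,
    // and returns the column names of the result.
    static String[] aliasedColumnNames() {
        SparkConf conf = new SparkConf().setMaster("local[1]").setAppName("alias-udf-sketch");
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            // Assumption: a plain SQLContext stands in for the thread's hiveContext.
            SQLContext sqlContext = new SQLContext(sc);

            // Assumption: placeholder UDF body; the original only says "bla bla".
            sqlContext.udf().register("MyUDF", new UDF1<String, String>() {
                @Override
                public String call(String o) throws Exception {
                    return o == null ? null : o.toUpperCase();
                }
            }, DataTypes.StringType);

            // Hypothetical sample data with a single string column "col1".
            JavaRDD<Row> rows = sc.parallelize(Arrays.asList("a", "b")).map(
                new Function<String, Row>() {
                    @Override
                    public Row call(String s) {
                        return RowFactory.create(s);
                    }
                });
            StructType schema = DataTypes.createStructType(Arrays.asList(
                DataTypes.createStructField("col1", DataTypes.StringType, true)));
            DataFrame df = sqlContext.createDataFrame(rows, schema);

            // The alias is applied to the Column returned by callUDF.
            DataFrame result = df.select(callUDF("MyUDF", col("col1")).as("mytest"));
            return result.columns();
        } finally {
            sc.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(aliasedColumnNames()));
    }
}
```

The same aliased Column can be passed to agg(...) or mixed with other columns in select(...), since the alias belongs to the Column rather than to the query string.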

On Fri, Oct 2, 2015 at 8:23 PM, Michael Armbrust <mich...@databricks.com>
wrote:

> import static org.apache.spark.sql.functions.*;
>
> callUDF("MyUDF", col("col1"), col("col2"))
>
> On Fri, Oct 2, 2015 at 6:25 AM, unk1102 <umesh.ka...@gmail.com> wrote:
>
>> Hi, I have registered my Hive UDF using the following code:
>>
>> hiveContext.udf().register("MyUDF", new UDF1<String, String>() {
>> public String call(String o) throws Exception {
>> //bla bla
>> }
>> }, DataTypes.StringType);
>>
>> Now I want to use the above MyUDF in a DataFrame. How do we use it? I know
>> how to use it in SQL, and it works fine:
>>
>> hiveContext.sql("select MyUDF('test') from myTable");
>>
>> My hiveContext.sql() query involves a group by on multiple columns, so for
>> scaling purposes I am trying to convert this query to the DataFrame API:
>>
>> dataframe.select("col1","col2","coln").groupBy("col1","col2","coln").count();
>>
>> Can we do the following: dataframe.select(MyUDF("col1"))??? Please guide.
>>
>
