The only UDAFs that we support today are those defined using the Hive UDAF
API.  Otherwise you'll have to drop into Spark operations.  I'd suggest
opening a JIRA.
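
For the custom-aggregate case below, the usual workaround today is to drop
down to the RDD and do the aggregation yourself. A rough, untested sketch
(my_custom_agg is just a placeholder for whatever aggregation logic you need;
column names match the example in the quoted mail):

pairs = df.rdd.map(lambda row: (row.name, row.age))
result = (pairs.groupByKey()
               .mapValues(lambda ages: my_custom_agg(list(ages)))
               .collect())

And if you already have an aggregate written against the Hive UDAF API, you
can register it through a HiveContext and call it from SQL, along these lines
(the class name and table name here are placeholders; the DataFrame would need
to be registered as a temp table first):

sqlContext.sql("CREATE TEMPORARY FUNCTION my_agg AS 'com.example.MyUDAF'")
sqlContext.sql("SELECT name, my_agg(age) FROM people GROUP BY name")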

On Tue, Mar 24, 2015 at 10:49 AM, jamborta <jambo...@gmail.com> wrote:

> Hi all,
>
> I have been trying out the new dataframe api in 1.3, which looks great by
> the way.
>
> I have found an example to define udfs and add them to select operations,
> like this:
>
> from pyspark.sql import functions as F
> from pyspark.sql.types import IntegerType
> slen = F.udf(lambda s: len(s), IntegerType())
> df.select(df.age, slen(df.name).alias('slen')).collect()
>
> Is it possible to do something similar with aggregates? Something like
> this:
>
> gdf = df.groupBy(df.name)
> gdf.agg(slen(df.age)).collect()
>
> thanks,
>
