The only UDAFs we support today are those defined using the Hive UDAF API. For anything else you'll have to drop down to regular Spark operations on the underlying data. I'd suggest opening a JIRA for this.
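For illustration only, here is a minimal sketch of what a custom per-group aggregate could look like once you leave the DataFrame API. It assumes that "Spark operations" means the key/value RDD API, and it uses a per-name mean of `age` as a stand-in for whatever custom aggregate is wanted. The sample rows and the `aggregate_by_key_local` helper are hypothetical: the helper only imitates the zero value / per-partition fold / cross-partition merge contract of `RDD.aggregateByKey`, so the snippet runs without a cluster.

```python
# Hypothetical sample data standing in for (name, age) pairs pulled out of a DataFrame.
rows = [("alice", 30), ("bob", 25), ("alice", 40), ("bob", 35), ("carol", 50)]

ZERO = (0, 0)  # accumulator: (running sum, running count)


def seq_op(acc, age):
    """Fold one value into a partition-local accumulator."""
    return (acc[0] + age, acc[1] + 1)


def comb_op(a, b):
    """Merge two accumulators that were built on different partitions."""
    return (a[0] + b[0], a[1] + b[1])


def aggregate_by_key_local(pairs, zero, seq, comb, num_partitions=2):
    """Local stand-in (not Spark code) that mimics aggregateByKey's contract."""
    # Split the data round-robin to imitate partitions.
    partitions = [pairs[i::num_partitions] for i in range(num_partitions)]
    partials = []
    for part in partitions:
        accs = {}
        for key, value in part:
            accs[key] = seq(accs.get(key, zero), value)
        partials.append(accs)
    # Merge the per-partition results key by key.
    merged = {}
    for accs in partials:
        for key, acc in accs.items():
            merged[key] = comb(merged[key], acc) if key in merged else acc
    return merged


totals = aggregate_by_key_local(rows, ZERO, seq_op, comb_op)
mean_age = {name: s / float(n) for name, (s, n) in totals.items()}
print(mean_age)
```

With a real DataFrame, the same `ZERO`, `seq_op` and `comb_op` could be handed to the RDD API, roughly as `df.rdd.map(lambda r: (r.name, r.age)).aggregateByKey(ZERO, seq_op, comb_op)`, followed by a `mapValues` step to turn each (sum, count) pair into a mean. Treat that as a sketch rather than a tested recipe for 1.3.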
On Tue, Mar 24, 2015 at 10:49 AM, jamborta <jambo...@gmail.com> wrote:

> Hi all,
>
> I have been trying out the new dataframe api in 1.3, which looks great by
> the way.
>
> I have found an example to define udfs and add them to select operations,
> like this:
>
> slen = F.udf(lambda s: len(s), IntegerType())
> df.select(df.age, slen(df.name).alias('slen')).collect()
>
> is it possible to do something similar with aggregates? Something like
> this:
>
> gdf = df.groupBy(df.name)
> gdf.agg(slen(df.age)).collect()
>
> thanks,