UDAF support for DataFrames in Spark 1.5.0?

Richard Cobbe Thu, 18 Feb 2016 08:32:18 -0800

I'm working on an application using DataFrames (Scala API) in Spark 1.5.0,
and we need to define and use several custom aggregators.  I'm having
trouble figuring out how to do this, however.


First, which version of Spark did UDAF support land in?  Has it in fact
landed at all?

https://issues.apache.org/jira/browse/SPARK-3947 suggests that UDAFs should
be available in 1.5.0.  However, although the associated pull request
includes classes like org.apache.spark.sql.UDAFRegistration, these classes
don't appear in the API docs, and I'm not able to use them from the spark
shell ("type UDAFRegistration is not a member of package
org.apache.spark.sql").

I don't have access to a Spark 1.6.0 installation, but UDAFRegistration
doesn't appear in the Scaladoc pages for 1.6.

Second, assuming that this functionality is supported in some version of
Spark, could someone point me to some documentation or an example that
demonstrates how to define and use a custom aggregation function?

Many thanks,

Richard

