GitHub user icexelloss opened a pull request: https://github.com/apache/spark/pull/20295
[SPARK-23011] Support alternative function form with group aggregate pandas UDF

## What changes were proposed in this pull request?

This PR proposes to support an alternative function form with group aggregate pandas UDF. The current form:
```
def foo(pdf):
    return ...
```
takes a single argument, a pandas DataFrame.

With this PR, an alternative form is supported:
```
def foo(key, pdf):
    return ...
```
The alternative form takes two arguments: a tuple that represents the grouping key, and a pandas DataFrame that represents the data.

## How was this patch tested?

GroupbyApplyTests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/icexelloss/spark SPARK-23011-groupby-apply-key

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20295.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20295

----
commit 38195ac7f0b9e1227cfc1e407de47e276b3fc43f
Author: Li Jin <ice.xelloss@...>
Date:   2018-01-17T18:37:18Z

    Initial commit. Test passes.

----
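
To illustrate the difference between the one-argument and two-argument function forms described in this PR, here is a minimal plain-Python sketch. It is not Spark's implementation: `apply_grouped`, `mean_v`, and `mean_v_with_key` are hypothetical names, a list of dicts stands in for the pandas DataFrame, and dispatching on the number of declared parameters is an assumption about how the two forms could be told apart.

```python
import inspect
from collections import defaultdict


def apply_grouped(rows, key_cols, func):
    """Hypothetical helper: group `rows` by `key_cols`, then call `func` once per group.

    `rows` is a list of dicts standing in for a pandas DataFrame.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_cols)  # the grouping key is a tuple
        groups[key].append(row)

    # Assumption: the form is chosen by counting the function's parameters.
    n_args = len(inspect.signature(func).parameters)
    results = []
    for key, group in groups.items():
        if n_args == 1:
            results.append(func(group))        # current form: foo(pdf)
        elif n_args == 2:
            results.append(func(key, group))   # alternative form: foo(key, pdf)
        else:
            raise ValueError("func must take (pdf) or (key, pdf)")
    return results


def mean_v(pdf):
    # One-argument form: only the group's data is visible.
    return sum(r["v"] for r in pdf) / len(pdf)


def mean_v_with_key(key, pdf):
    # Two-argument form: the grouping key arrives separately as a tuple.
    return key + (sum(r["v"] for r in pdf) / len(pdf),)


rows = [{"id": 1, "v": 1.0}, {"id": 1, "v": 3.0}, {"id": 2, "v": 5.0}]
one_arg = apply_grouped(rows, ["id"], mean_v)
two_arg = apply_grouped(rows, ["id"], mean_v_with_key)
```

In this sketch the two-argument form can attach the key to its result without reading it back out of the data, which is the convenience the alternative form is meant to offer.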