GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/19517
    [SPARK-20396][SQL][PySpark][FOLLOW-UP] groupby().apply() with pandas udf

    ## What changes were proposed in this pull request?

    This is a follow-up of #18732.
    This PR modifies the `GroupedData.apply()` method to implicitly convert a pandas udf to a grouped udf.

    ## How was this patch tested?

    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-20396/fup2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19517

----
commit 4d2bd959e1eeabb4f72cfbb52a374ce721030507
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T06:45:55Z

    Introduce `@pandas_grouped_udf` decorator for grouped vectorized UDF.

commit f0968702038e11c9c9a8f305c61f72d3f9e00f9a
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T08:03:30Z

    Use PythonUdfType instead of vectorized and grouped.

commit 639af2cee77456271d5f2f536d4712ab8e01a89d
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T13:42:58Z

    Update an error message.

commit 10512a64a9560eee6d3f65802abd042dedf0cafb
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T13:43:51Z

    Add a test to use data type string.

commit 789e642763ab4f59e14137fcc75b514223bc7aae
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T14:13:43Z

    Restrict the number of arguments for grouped udf to only 1.

commit 122a7bccaff11def2c12cfccdd00244394ed3478
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T16:24:03Z

    Restrict checking the number of arguments.

commit fdafb3561d44ca2583380b7aeaf7843ce5285b1e
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T16:54:23Z

    Revert "Restrict checking the number of arguments."

    This reverts commit 122a7bccaff11def2c12cfccdd00244394ed3478.

commit 94d05f4f8d5c663319ec12668dbd1206ffa2e83a
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T18:10:50Z

    Address comments.

commit 733296951b45d760aa0a8465eb0189077ea67372
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-16T18:33:08Z

    Add tests for unsupported type.

commit 85f250d0eda56606a599c5fb15046ef0fd63a3c4
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-17T04:59:34Z

    Address a comment.

commit 7b386c4be48c0a2e8de6f04cf341de13e8e98444
Author: Takuya UESHIN <ues...@databricks.com>
Date:   2017-10-17T14:12:37Z

    Remove `@pandas_grouped_udf` and convert implicitly.

----

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org