GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/19325
[SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_udf and add doctests

## What changes were proposed in this pull request?

This change disables 0-parameter pandas_udfs because the API for them is overly complex and awkward, and they can easily be worked around by using an index column as an input argument. It also adds doctests for pandas_udfs, which revealed bugs in handling empty partitions and in using pandas_udf as a decorator.

## How was this patch tested?

Reworked the existing 0-parameter test to verify that an error is raised, added a doctest for pandas_udf, and added new tests for empty partitions and decorator usage.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark arrow-pandas_udf-0-param-remove-SPARK-22106

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19325.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19325

----
commit c0eec8d2484a3aa2b9a4c5f6d7fb32125f33f623
Author: Bryan Cutler <cutl...@gmail.com>
Date:   2017-09-22T18:08:58Z

    disabled support for 0-parameter pandas_udfs

commit 7b0da106fb64a16b77c62953bb12548fda3f7ef3
Author: Bryan Cutler <cutl...@gmail.com>
Date:   2017-09-22T20:11:02Z

    added doctests, fix for decorator and empty partition

----
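
For reference, the workaround mentioned in the description (passing an index column as input instead of defining a 0-parameter pandas_udf) might look roughly like the sketch below. The function name `ones_like` and the sample batch are hypothetical and not taken from the patch. The function body is exercised with plain pandas only; the Spark registration is shown as a comment and assumes `pandas_udf` from `pyspark.sql.functions` and a running `spark` session.

```python
import pandas as pd


def ones_like(idx):
    """Return a constant column with one row per row of the input batch."""
    # The values of idx are ignored; only its length is used, so the
    # output always matches the batch size, including empty batches.
    return pd.Series([1] * len(idx), dtype="int64")


# Plain-pandas check of the function body (no Spark required).
batch = pd.Series([0, 1, 2])
result = ones_like(batch)

# Assumed Spark usage, with the "id" column acting as the index input:
#   from pyspark.sql.functions import pandas_udf, col
#   from pyspark.sql.types import LongType
#   ones = pandas_udf(ones_like, LongType())
#   spark.range(3).select(ones(col("id"))).show()
```

Because the input column fixes the length of each batch, the function needs no separate size argument, which seems to be the complexity the 0-parameter form would otherwise have to carry.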