[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-27 Thread mstewart141
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20900 @icexelloss as a daily user of `pandas_udf`, the inability to use keyword arguments, and the difficulties around default arguments (due in part to the magic that converts string arguments to

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-27 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/20900 Created https://issues.apache.org/jira/browse/SPARK-23800 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-27 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/20900 @HyukjinKwon Thanks for the explanation. I will create Jira for partial functions and callable objects in Pandas UDF. I am happy to take a look at it. ---

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 The issue here (SPARK-23645) describes support for keyword arguments on the calling side in both UDF and Pandas UDF. It does not seem to work, but the fix looks like it will be quite invasive and big. So, I

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 to be clear, I think both functions below ```python class F(object): def __call__(...): ... func = F() ``` ```python def naive_func(a,

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 @icexelloss, yup ^ is correct. IIRC, we have some tests for normal udfs with callable objects and partial functions separately, but the problem seems to be in Pandas UDF. I think the fix itself

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-26 Thread mstewart141
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20900 Partials (and callable objects) are supported in UDF but not `pandas_udf`; kw args are not supported by either.
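
To make the terms in the comment above concrete, here is a minimal plain-Python sketch (no Spark involved) of a partial function and a callable object. The names `add`, `Adder`, `add_ten` and `adder` are hypothetical and do not come from the PR. The sketch only shows how such objects differ from a plain function under introspection; that this difference is what trips up a UDF wrapper is an assumption, not a description of Spark's implementation.

```python
import functools
import inspect


def add(a, b):
    """Plain function: the case that works everywhere."""
    return a + b


class Adder(object):
    """Callable object: instances can be called like a function."""

    def __init__(self, b):
        self.b = b

    def __call__(self, a):
        return a + self.b


# Partial function: `b` is bound ahead of time as a keyword argument.
add_ten = functools.partial(add, b=10)
adder = Adder(10)

for f in (add, add_ten, adder):
    # Neither partials nor callable instances are "functions" to inspect,
    # and neither carries a __name__ attribute by default.
    print(
        type(f).__name__,
        "isfunction=%s" % inspect.isfunction(f),
        "has __name__=%s" % hasattr(f, "__name__"),
        "params=%s" % list(inspect.signature(f).parameters),
    )
```

All three objects are callable and return the same result for the same input, but only `add` is a function in the sense of `inspect.isfunction`. Code that inspects a UDF's argument list therefore has to handle each kind explicitly.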

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-26 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/20900 Thank you @mstewart141 for looking into this. @HyukjinKwon should we open Jira for supporting kw args and partial functions in python UDFs? If I understand correctly, this is related to

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 Merged to master and branch-2.3 anyway.

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 I think we should generally make everything work in both Python 2 and Python 3, but I would also like to know if there are any special cases that I am missing.

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20900 Merged build finished. Test PASSed.

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20900 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88573/ Test PASSed. ---

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20900 **[Test build #88573 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88573/testReport)** for PR 20900 at commit

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20900 > One general question: how do we tend to think about the py2/3 split for api quirks/features? Must everything that is added for py3 also be functional in py2? ideally, is there

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread mstewart141
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20900 Many (though not all, I don't think `callable`s are impacted) of the limitations of pandas_udf relative to UDF in this domain are due to the fact that `pandas_udf` doesn't allow for keyword

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20900 **[Test build #88573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88573/testReport)** for PR 20900 at commit

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 LGTM except https://github.com/apache/spark/pull/20900#discussion_r176930776

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 From a very quick look at the case "Try to be sneaky and don't use keywords with partial:", it seems to be due to a type mismatch. This seems to work fine (in Python 3): ``` >>>

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 @mstewart141, just to be clear, the error: ``` ValueError: Function has keyword-only parameters or annotations, use getfullargspec() API which can support them ``` is
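
The `ValueError` quoted above comes from Python's standard library: the legacy `inspect.getargspec` cannot describe keyword-only parameters or annotations and was removed in Python 3.11. The following minimal sketch, which assumes nothing about Spark's own code, shows that `inspect.getfullargspec` reports both. The function `scale` is a hypothetical example, not taken from the PR.

```python
import inspect


def scale(values, *, factor: float = 2.0):
    """Hypothetical function with a keyword-only, annotated parameter."""
    return [v * factor for v in values]


# getfullargspec separates positional parameters from keyword-only ones
# and also exposes defaults and annotations.
spec = inspect.getfullargspec(scale)

print("positional:", spec.args)
print("keyword-only:", spec.kwonlyargs)
print("keyword-only defaults:", spec.kwonlydefaults)
print("annotations:", spec.annotations)
```

A wrapper that only needs the number of positional parameters can read `spec.args` and ignore the rest, which is one way to avoid the error without restricting the function's signature.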

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20900 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88566/ Test PASSed. ---

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20900 Merged build finished. Test PASSed.

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20900 **[Test build #88566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88566/testReport)** for PR 20900 at commit

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20900 **[Test build #88566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88566/testReport)** for PR 20900 at commit

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20900 ok to test

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread mstewart141
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20900 @HyukjinKwon the old pr: https://github.com/apache/spark/pull/20798 was a disaster from a git-cleanliness perspective so i've updated here. ---

[GitHub] spark issue #20900: [SPARK-23645][MINOR][DOCS][PYTHON] Add docs RE `pandas_u...

2018-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20900 Can one of the admins verify this patch?
