Github user mstewart141 commented on the issue:
https://github.com/apache/spark/pull/20900
@icexelloss as a daily user of `pandas_udf`, the inability to use keyword
arguments, and the difficulties around default arguments (due in part to the
magic that converts string arguments to
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/20900
Created https://issues.apache.org/jira/browse/SPARK-23800
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/20900
@HyukjinKwon Thanks for the explanation. I will create a Jira for partial
functions and callable objects in Pandas UDF. I am happy to take a look at it.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
The issue itself here (SPARK-23645) describes kwargs support in both UDF
and Pandas UDF on the calling side. It does not seem to work, but the fix looks
like it will be quite invasive and big. So, I
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
to be clear, I think both functions below
```python
class F(object):
    def __call__(...):
        ...
func = F()
```
```python
def naive_func(a,
```
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
@icexelloss, yup ^ is correct. IIRC, we have some tests for normal udfs
with callable objects and partial functions separately, but it seems the problem
is in Pandas UDF. I think the fix itself
Github user mstewart141 commented on the issue:
https://github.com/apache/spark/pull/20900
Partials (and callable objects) are supported in UDF but not `pandas_udf`;
kw args are not supported by either.
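As a rough illustration of why partials and callable objects can need different handling from plain functions, here is a minimal pure-Python sketch (no Spark involved). The names `add`, `Adder`, and `describe` are hypothetical and not taken from the PR. It only shows that these callables expose different introspection attributes; whether that is the exact cause of the `pandas_udf` behaviour discussed here is an assumption.

```python
import functools
import inspect


def add(a, b):
    return a + b


class Adder(object):
    """Hypothetical callable object, analogous to the class F discussed above."""

    def __call__(self, a, b):
        return a + b


def describe(f):
    """Return (has a __name__ attribute, list of parameter names) for a callable."""
    return hasattr(f, "__name__"), list(inspect.signature(f).parameters)


plain_info = describe(add)                          # ordinary function
partial_info = describe(functools.partial(add, 1))  # first argument pre-bound
instance_info = describe(Adder())                   # instance with __call__
```

Code that relies on `f.__name__` or on function-only introspection would work for `add` but not for the other two, while `inspect.signature` handles all three.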
---
Github user icexelloss commented on the issue:
https://github.com/apache/spark/pull/20900
Thank you @mstewart141 for looking into this.
@HyukjinKwon should we open a Jira for supporting kw args and partial
functions in python UDFs? If I understand correctly, this is related to
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
Merged to master and branch-2.3 anyway.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
I think we should generally make everything work in both Python 2 and
Python 3, but I want to know if there are any special cases that I am missing.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20900
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20900
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88573/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20900
**[Test build #88573 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88573/testReport)**
for PR 20900 at commit
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/20900
> One general question: how do we tend to think about the py2/3 split for
api quirks/features? Must everything that is added for py3 also be functional
in py2?
ideally, is there
Github user mstewart141 commented on the issue:
https://github.com/apache/spark/pull/20900
Many (though not all, I don't think `callable`s are impacted) of the
limitations of pandas_udf relative to UDF in this domain are due to the fact
that `pandas_udf` doesn't allow for keyword
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20900
**[Test build #88573 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88573/testReport)**
for PR 20900 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
LGTM except https://github.com/apache/spark/pull/20900#discussion_r176930776
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
From a very quick look at the case "Try to be sneaky and don't use
keywords with partial:".
It seems to be due to a type mismatch. This seems to work fine (in Python 3):
```
>>>
```
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
@mstewart141, just to be clear, the error:
```
ValueError: Function has keyword-only parameters or annotations, use
getfullargspec() API which can support them
```
is
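For context, the quoted message appears to match the `ValueError` that Python 3's legacy `inspect.getargspec` raised for functions with annotations or keyword-only parameters. That function was removed in Python 3.11, so the sketch below calls only `inspect.getfullargspec`, which the message itself recommends. The function `scale` is a hypothetical example, not code from the PR.

```python
import inspect


def scale(x: float, *, factor: float = 2.0) -> float:
    """Hypothetical function with annotations and a keyword-only parameter."""
    return x * factor


# getfullargspec reports the keyword-only parameters and annotations
# that the older getargspec API could not represent.
spec = inspect.getfullargspec(scale)

positional = spec.args               # positional-or-keyword parameter names
keyword_only = spec.kwonlyargs       # names declared after the bare `*`
kw_defaults = spec.kwonlydefaults    # defaults for keyword-only parameters
annotations = spec.annotations       # includes the 'return' annotation
```

`inspect.signature` is another option that covers the same cases and also works on partials and callable objects.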
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20900
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88566/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20900
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20900
**[Test build #88566 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88566/testReport)**
for PR 20900 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20900
**[Test build #88566 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88566/testReport)**
for PR 20900 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20900
ok to test
---
Github user mstewart141 commented on the issue:
https://github.com/apache/spark/pull/20900
@HyukjinKwon the old pr: https://github.com/apache/spark/pull/20798
was a disaster from a git-cleanliness perspective so i've updated here.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20900
Can one of the admins verify this patch?
---