Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20900

@mstewart141, just to be clear, the error:

```
ValueError: Function has keyword-only parameters or annotations, use getfullargspec() API which can support them
```

comes from the deprecated `getargspec`, not from `getfullargspec`, which your change already fixes. The current error looks like this:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/functions.py", line 2380, in pandas_udf
    return _create_udf(f=f, returnType=return_type, evalType=eval_type)
  File "/.../spark/python/pyspark/sql/udf.py", line 51, in _create_udf
    argspec = _get_argspec(f)
  File "/.../spark/python/pyspark/util.py", line 60, in _get_argspec
    argspec = inspect.getargspec(f)
  File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 818, in getargspec
    raise TypeError('{!r} is not a Python function'.format(func))
TypeError: <functools.partial object at 0x1117dccb0> is not a Python function
```

with the reproducer below:

```python
from functools import partial
from pyspark.sql.functions import pandas_udf

def test_func(a, b):
    return a + b

pandas_udf(partial(test_func, b='id'), "string")
```

I think this should work like a normal UDF does:

```python
from functools import partial
from pyspark.sql.functions import udf

def test_func(a, b):
    return a + b

normal_udf = udf(partial(test_func, b='id'), "string")
df = spark.createDataFrame([["a"]])
df.select(normal_udf("_1")).show()
```

So, I think we should add support for callable objects / partial functions in Pandas UDFs. Would you be interested in filing JIRA(s) and proceeding? If you are busy, I am willing to do it as well.