Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20900

@mstewart141, just to be clear, the error:

```
ValueError: Function has keyword-only parameters or annotations, use getfullargspec() API which can support them
```

comes from the deprecated `getargspec`, not from `getfullargspec`, which your change already fixes. The current error looks like this:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/functions.py", line 2380, in pandas_udf
    return _create_udf(f=f, returnType=return_type, evalType=eval_type)
  File "/.../spark/python/pyspark/sql/udf.py", line 51, in _create_udf
    argspec = _get_argspec(f)
  File "/.../spark/python/pyspark/util.py", line 60, in _get_argspec
    argspec = inspect.getargspec(f)
  File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/inspect.py", line 818, in getargspec
    raise TypeError('{!r} is not a Python function'.format(func))
TypeError: <functools.partial object at 0x1117dccb0> is not a Python function
```

with the reproducer below:

```python
from functools import partial
from pyspark.sql.functions import pandas_udf

def test_func(a, b):
    return a + b

pandas_udf(partial(test_func, b='id'), "string")
```

I think this should work like a normal UDF does:

```python
from functools import partial
from pyspark.sql.functions import udf

def test_func(a, b):
    return a + b

normal_udf = udf(partial(test_func, b='id'), "string")
df = spark.createDataFrame([["a"]])
df.select(normal_udf("_1")).show()
```

So, I think we should add support for callable objects / partial functions in Pandas UDFs. Would you be interested in filing JIRA(s) and proceeding? If you are busy, I am willing to do it as well.