Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19325#discussion_r140835837
  
    --- Diff: python/pyspark/worker.py ---
    @@ -80,14 +77,12 @@ def wrap_pandas_udf(f, return_type):
         arrow_return_type = toArrowType(return_type)
     
         def verify_result_length(*a):
    -        kwargs = a[-1]
    -        result = f(*a[:-1], **kwargs)
    -        if len(result) != kwargs["length"]:
    +        result = f(*a)
    +        if len(result) != len(a[0]):
    --- End diff --
    
    Good point. We should probably also have a test for a UDF that returns a scalar value. I'm not sure we should limit the return type so much. As long as pyarrow can consume it, it should be ok — it can also take a numpy array, which might be useful. Otherwise it should raise a clear exception. Maybe checking that the result has `__len__` is good enough?
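
    A minimal sketch of the check being discussed: wrap the UDF and verify the result is sized (pandas Series and numpy arrays both define `__len__`) and matches the input length. The wrapper name and error messages here are illustrative, not Spark's actual internals.

    ```python
    def wrap_verify_result_length(f):
        """Hypothetical wrapper sketching the length check discussed above."""
        def verify_result_length(*a):
            result = f(*a)
            # Require a sized result: pyarrow can consume pandas Series
            # or numpy arrays, both of which define __len__. Anything
            # without __len__ (e.g. a scalar) gets a clear error.
            if not hasattr(result, "__len__"):
                raise TypeError(
                    "Return value of the UDF must have a length, got %s"
                    % type(result).__name__)
            if len(result) != len(a[0]):
                raise RuntimeError(
                    "Result length %d does not match input length %d"
                    % (len(result), len(a[0])))
            return result
        return verify_result_length
    ```

    With this, `wrap_verify_result_length(lambda s: [x * 2 for x in s])([1, 2, 3])` passes, while a UDF returning a bare scalar raises `TypeError` instead of failing later inside pyarrow.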


---
