Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19325#discussion_r140835837

--- Diff: python/pyspark/worker.py ---
@@ -80,14 +77,12 @@ def wrap_pandas_udf(f, return_type):
     arrow_return_type = toArrowType(return_type)

     def verify_result_length(*a):
-        kwargs = a[-1]
-        result = f(*a[:-1], **kwargs)
-        if len(result) != kwargs["length"]:
+        result = f(*a)
+        if len(result) != len(a[0]):
--- End diff --

Good point. We should probably have a test that returns a scalar value too.

I'm not sure we should limit the return type so much. As long as pyarrow can consume it, then it should be ok - it can also take a numpy array, which might be useful. Otherwise it should raise a clear exception. Maybe checking that it has `__len__` is good enough?
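A minimal sketch of the idea being discussed, assuming a standalone wrapper (`verify_result_length` here is a hypothetical reimplementation, not the actual Spark worker code): instead of restricting the UDF's return type to `pandas.Series`, only require that the result supports `__len__`, raise a clear exception otherwise, and still check that the length matches the first input's length.

```python
import numpy as np
import pandas as pd


def verify_result_length(f):
    """Wrap a UDF so its result is length-checked against the first input."""
    def wrapped(*a):
        result = f(*a)
        # Accept anything pyarrow could consume (Series, numpy array, list),
        # as long as it has a length; fail clearly for scalars.
        if not hasattr(result, "__len__"):
            raise TypeError(
                "Return value of the user-defined function should be a "
                "sequence (e.g. pandas.Series or numpy.ndarray), got "
                + type(result).__name__)
        if len(result) != len(a[0]):
            raise RuntimeError(
                "Result length %d does not match input length %d"
                % (len(result), len(a[0])))
        return result
    return wrapped


# A Series-returning UDF and a numpy-array-returning UDF both pass the check;
# a scalar-returning UDF raises TypeError instead of a confusing len() error.
doubled = verify_result_length(lambda s: s * 2)
as_array = verify_result_length(lambda s: np.asarray(s) + 1)
print(len(doubled(pd.Series([1, 2, 3]))))   # 3
print(len(as_array(pd.Series([1, 2, 3]))))  # 3
```

This mirrors the review suggestion: a `__len__` check is a cheap duck-typing test that keeps the API open to any sequence-like result while turning a scalar return into an explicit error.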