[GitHub] spark pull request #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work wi...

2018-03-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20728


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work wi...

2018-03-04 Thread mstewart141
Github user mstewart141 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20728#discussion_r172063118
  
--- Diff: python/pyspark/sql/udf.py ---
@@ -42,10 +42,15 @@ def _create_udf(f, returnType, evalType):
                     PythonEvalType.SQL_GROUPED_AGG_PANDAS_UDF):
 
         import inspect
+        import sys
         from pyspark.sql.utils import require_minimum_pyarrow_version
 
         require_minimum_pyarrow_version()
-        argspec = inspect.getargspec(f)
+
+        if sys.version_info[0] < 3:
+            argspec = inspect.getargspec(f)
+        else:
+            argspec = inspect.getfullargspec(f)
--- End diff --

can do.





[GitHub] spark pull request #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work wi...

2018-03-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20728#discussion_r172043971
  
--- Diff: python/pyspark/sql/udf.py ---
@@ -42,10 +42,15 @@ def _create_udf(f, returnType, evalType):
                     PythonEvalType.SQL_GROUPED_AGG_PANDAS_UDF):
 
         import inspect
+        import sys
         from pyspark.sql.utils import require_minimum_pyarrow_version
 
         require_minimum_pyarrow_version()
-        argspec = inspect.getargspec(f)
+
+        if sys.version_info[0] < 3:
+            argspec = inspect.getargspec(f)
+        else:
+            argspec = inspect.getfullargspec(f)
--- End diff --

Shall we add a small comment while we are here, like `'getargspec' is 
deprecated since Python 3.0, and calling it on a function with type hints 
causes an actual issue. See SPARK-23569`?





[GitHub] spark pull request #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work wi...

2018-03-03 Thread mstewart141
GitHub user mstewart141 opened a pull request:

https://github.com/apache/spark/pull/20728

[SPARK-23569][PYTHON] Allow pandas_udf to work with python3 style 
type-annotated functions

## What changes were proposed in this pull request?

Check the Python version to determine whether to use `inspect.getargspec` or 
`inspect.getfullargspec` before applying the `pandas_udf` core logic to a 
function. The former is the Python 2.7 API (deprecated in Python 3); the 
latter exists only in Python 3.x. The latter correctly accounts for type 
annotations, which are syntax errors in Python 2.x.
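
For illustration, here is a minimal sketch of the version check described above, pulled out into a standalone helper. The names `get_argspec` and `plus_one` are hypothetical and are not part of the patch, and because the example function is annotated, this snippet itself only parses on Python 3.

```python
import inspect
import sys


def get_argspec(f):
    """Return an argspec for f, choosing the call by interpreter major version."""
    if sys.version_info[0] < 3:
        # Python 2 has no getfullargspec, and annotations cannot occur there.
        return inspect.getargspec(f)
    # Python 3: getfullargspec accepts functions that carry type annotations.
    return inspect.getfullargspec(f)


# Hypothetical annotated function standing in for a user's pandas_udf body.
def plus_one(v: int) -> int:
    return v + 1


spec = get_argspec(plus_one)
print(spec.args)         # names of the positional arguments
print(spec.annotations)  # mapping of parameter names (and 'return') to types
```

Either call returns a named tuple with an `args` field, so code that only reads the argument names can stay the same on both interpreter lines.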

## How was this patch tested?

Locally, on Python 2.7 and 3.6.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mstewart141/spark pandas_udf_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20728.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20728


commit 3cd53f39f23ebd1b9b4134a9ac22348b301f8bd4
Author: Michael (Stu) Stewart 
Date:   2018-03-03T21:54:53Z

[SPARK-23569][PYTHON] Allow pandas_udf to work with python3 style 
type-annotated functions



