[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-05 Thread icexelloss
Github user icexelloss commented on the issue:

https://github.com/apache/spark/pull/20728
  
LGTM too


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20728
  
Merged to master and branch-2.3.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/20728
  
LGTM.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20728
  
Will merge this one if there are no more comments in a few days.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20728
  
Merged build finished. Test PASSed.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20728
  
**[Test build #87948 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87948/testReport)** for PR 20728 at commit [`0395690`](https://github.com/apache/spark/commit/0395690d8d2c719d306c46a08a7a2faf8469ecb9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20728
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87948/
Test PASSed.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20728
  
**[Test build #87948 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87948/testReport)** for PR 20728 at commit [`0395690`](https://github.com/apache/spark/commit/0395690d8d2c719d306c46a08a7a2faf8469ecb9).





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread mstewart141
Github user mstewart141 commented on the issue:

https://github.com/apache/spark/pull/20728
  
Your test definitely makes sense. Yeah, the syntax error in the py2 part is why I 
wasn't sure how to go about testing this in the first place. This certainly 
gets the job done.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20728
  
cc @ueshin, @BryanCutler, @icexelloss FYI.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20728
  
I was just double checking if we can write a test. Mind adding the test 
below if it makes sense?

```diff
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 19653072ea3..c46423ac905 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -4381,6 +4381,24 @@ class ScalarPandasUDFTests(ReusedSQLTestCase):
         result = df.withColumn('time', foo_udf(df.time))
         self.assertEquals(df.collect(), result.collect())

+    @unittest.skipIf(sys.version_info[:2] < (3, 5), "Type hints are supported from Python 3.5.")
+    def test_type_annotation(self):
+        from pyspark.sql.functions import pandas_udf
+        # Regression test to check if type hints can be used. See SPARK-23569.
+        # Note that it throws an error during compilation in lower Python versions if 'exec'
+        # is not used. Also, note that we explicitly use another dictionary to avoid modifications
+        # in the current 'locals()'.
+        #
+        # Hyukjin: I think it's an ugly way to test issues about syntax specific in
+        # higher versions of Python, which we shouldn't encourage. This was the last resort
+        # I could come up with at that time.
+        _locals = {}
+        exec(
+            "import pandas as pd\ndef _noop(col: pd.Series) -> pd.Series: return col",
+            _locals)
+        df = self.spark.range(1).select(pandas_udf(f=_locals['_noop'], returnType='bigint')('id'))
+        self.assertEqual(df.first()[0], 0)
+

 @unittest.skipIf(
     not _have_pandas or not _have_pyarrow,
```
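
For readers unfamiliar with the `exec` trick used in the proposed test, here is a minimal standalone sketch of the same idea, without Spark or pandas. The annotated function is kept in a string, so a Python 2 interpreter could still parse the surrounding file, and it is executed into a separate dictionary so the caller's `locals()` are left alone. The names `SOURCE` and `load_annotated_function` are hypothetical and used only for illustration; `int` stands in for `pd.Series`.

```python
# Source text that uses Python 3 annotation syntax. Because it lives in a
# string, the enclosing file contains no annotation syntax of its own.
SOURCE = "def _noop(col: int) -> int:\n    return col\n"


def load_annotated_function():
    """Execute SOURCE in its own namespace and return the function it defines."""
    namespace = {}  # separate dict, so the caller's locals() are not modified
    exec(SOURCE, namespace)
    return namespace["_noop"]


noop = load_annotated_function()
print(noop(3))
print(sorted(noop.__annotations__))
```

In a real test suite the call would sit behind a version guard such as the `unittest.skipIf` decorator in the diff above, since executing the string still fails on interpreters that lack annotation support.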





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20728
  
Usually I leave it open for a few days so that I or other reviewers can check 
this change. I or other reviewers will leave some review comments, or leave an 
approval on this PR if it looks good without additional changes. Will try to 
guide you explicitly here.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread mstewart141
Github user mstewart141 commented on the issue:

https://github.com/apache/spark/pull/20728
  
What should the next step be here?





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20728
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87938/
Test PASSed.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20728
  
Merged build finished. Test PASSed.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20728
  
**[Test build #87938 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87938/testReport)** for PR 20728 at commit [`3cd53f3`](https://github.com/apache/spark/commit/3cd53f39f23ebd1b9b4134a9ac22348b301f8bd4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20728
  
**[Test build #87938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87938/testReport)** for PR 20728 at commit [`3cd53f3`](https://github.com/apache/spark/commit/3cd53f39f23ebd1b9b4134a9ac22348b301f8bd4).





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20728
  
ok to test





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread mstewart141
Github user mstewart141 commented on the issue:

https://github.com/apache/spark/pull/20728
  
cc @HyukjinKwon 
👍 





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20728
  
Can one of the admins verify this patch?





[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...

2018-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20728
  
Can one of the admins verify this patch?

