[GitHub] spark issue #20728: [SPARK-23569][PYTHON] Allow pandas_udf to work with pyth...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/20728 LGTM too --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20728 Merged to master and branch-2.3.
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20728 LGTM.
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20728 Will merge this one if there are no more comments in a few days.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20728 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20728 **[Test build #87948 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87948/testReport)** for PR 20728 at commit [`0395690`](https://github.com/apache/spark/commit/0395690d8d2c719d306c46a08a7a2faf8469ecb9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20728 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87948/ Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20728 **[Test build #87948 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87948/testReport)** for PR 20728 at commit [`0395690`](https://github.com/apache/spark/commit/0395690d8d2c719d306c46a08a7a2faf8469ecb9).
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20728 Your test definitely makes sense; yeah, the syntax error under Python 2 is why I wasn't sure how to go about testing this in the first place. This certainly gets the job done.
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20728 cc @ueshin, @BryanCutler, @icexelloss FYI.
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20728 I was just double checking if we can write a test. Mind adding the test below if it makes sense?

```diff
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 19653072ea3..c46423ac905 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -4381,6 +4381,24 @@ class ScalarPandasUDFTests(ReusedSQLTestCase):
         result = df.withColumn('time', foo_udf(df.time))
         self.assertEquals(df.collect(), result.collect())
 
+    @unittest.skipIf(sys.version_info[:2] < (3, 5), "Type hints are supported from Python 3.5.")
+    def test_type_annotation(self):
+        from pyspark.sql.functions import pandas_udf
+        # Regression test to check if type hints can be used. See SPARK-23569.
+        # Note that it throws an error during compilation in lower Python versions if 'exec'
+        # is not used. Also, note that we explicitly use another dictionary to avoid modifications
+        # in the current 'locals()'.
+        #
+        # Hyukjin: I think it's an ugly way to test issues about syntax specific in
+        # higher versions of Python, which we shouldn't encourage. This was the last resort
+        # I could come up with at that time.
+        _locals = {}
+        exec(
+            "import pandas as pd\ndef _noop(col: pd.Series) -> pd.Series: return col",
+            _locals)
+        df = self.spark.range(1).select(pandas_udf(f=_locals['_noop'], returnType='bigint')('id'))
+        self.assertEqual(df.first()[0], 0)
+
 
 @unittest.skipIf(
     not _have_pandas or not _have_pyarrow,
```
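For readers following along without a Spark checkout, here is a minimal sketch of the `exec` trick the proposed test relies on, with pandas and Spark left out. It is an illustration only, not the code in the PR: the names `_SOURCE` and `load_annotated_noop` are hypothetical, and `list` stands in for `pd.Series`. The idea is that the annotated `def` lives in a string, so an interpreter that cannot parse annotations never sees the syntax when the module is compiled, and the function is defined in a separate dictionary rather than in the caller's `locals()`.

```python
import sys

# The annotated function is kept as a string so that interpreters without
# annotation syntax do not hit a SyntaxError when this module is compiled.
# (Hypothetical stand-in: `list` is used here instead of `pd.Series`.)
_SOURCE = "def _noop(col: list) -> list:\n    return col\n"


def load_annotated_noop():
    """Return the annotated function, or None on interpreters older than 3.5."""
    # Same version guard as the skipIf condition in the proposed test.
    if sys.version_info[:2] < (3, 5):
        return None
    namespace = {}  # separate dict, so the caller's locals() are not modified
    exec(_SOURCE, namespace)
    return namespace["_noop"]


noop = load_annotated_noop()
if noop is not None:
    print(noop([0]))                      # [0]
    print(sorted(noop.__annotations__))   # ['col', 'return']
```

In the actual test the function pulled out of the dictionary is then handed to `pandas_udf(f=..., returnType='bigint')`, as shown in the diff above.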
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20728 Usually I leave it open for a few days so that I or other reviewers can check this change. I or other reviewers will leave some review comments, or leave an approval on this PR if it looks good without additional changes. Will try to guide you explicitly here.
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20728 What should the next step be here?
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20728 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87938/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20728 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20728 **[Test build #87938 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87938/testReport)** for PR 20728 at commit [`3cd53f3`](https://github.com/apache/spark/commit/3cd53f39f23ebd1b9b4134a9ac22348b301f8bd4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20728 **[Test build #87938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87938/testReport)** for PR 20728 at commit [`3cd53f3`](https://github.com/apache/spark/commit/3cd53f39f23ebd1b9b4134a9ac22348b301f8bd4).
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20728 ok to test
Github user mstewart141 commented on the issue: https://github.com/apache/spark/pull/20728 cc @HyukjinKwon
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20728 Can one of the admins verify this patch?