[GitHub] spark pull request #20830: [SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20830
[GitHub] spark pull request #20830: [SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20830#discussion_r174991833

--- Diff: python/pyspark/sql/tests.py ---
@@ -3551,12 +3544,12 @@ def test_null_conversion(self):
         self.assertTrue(all([c == 1 for c in null_counts]))

     def _toPandas_arrow_toggle(self, df):
-        self.spark.conf.set("spark.sql.execution.arrow.enabled", "false")
-        try:
+        with self.sql_conf({"spark.sql.execution.arrow.enabled": False}):
             pdf = df.toPandas()
-        finally:
-            self.spark.conf.set("spark.sql.execution.arrow.enabled", "true")
-        pdf_arrow = df.toPandas()
+
+        with self.sql_conf({"spark.sql.execution.arrow.enabled": True}):
--- End diff --

Ah, OK, I am fine with that; I will omit this.
[GitHub] spark pull request #20830: [SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20830#discussion_r174991686

--- Diff: python/pyspark/sql/tests.py ---
@@ -2461,17 +2461,13 @@ def test_join_without_on(self):
         df1 = self.spark.range(1).toDF("a")
         df2 = self.spark.range(1).toDF("b")

-        try:
-            self.spark.conf.set("spark.sql.crossJoin.enabled", "false")
+        with self.sql_conf({"spark.sql.crossJoin.enabled": False}):
             self.assertRaises(AnalysisException,
                               lambda: df1.join(df2, how="inner").collect())

-        self.spark.conf.set("spark.sql.crossJoin.enabled", "true")
+        with self.sql_conf({"spark.sql.crossJoin.enabled": True}):
--- End diff --

Yup, it originally unset `spark.sql.crossJoin.enabled`, but now it sets it back to the original value. If `spark.sql.crossJoin.enabled` was unset before this test, the context manager will change it back to unset.
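To make the restore-to-unset behavior concrete, here is a small sketch of what `sql_conf` effectively does at the `RuntimeConfig` level. The session setup is illustrative, and the set/restore steps are mimicked by hand, since `sql_conf` itself lives on the test base class:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
key = "spark.sql.crossJoin.enabled"

# Before the block: no value was explicitly set on this session, so
# reading with a default of None reports the key as unset.
assert spark.conf.get(key, None) is None

# On entry, sql_conf remembers the old value and sets the new one.
old_value = spark.conf.get(key, None)
spark.conf.set(key, "false")
assert spark.conf.get(key) == "false"

# On exit, it restores the old value, or unsets the key entirely
# if there was no explicit value before the block.
if old_value is None:
    spark.conf.unset(key)
else:
    spark.conf.set(key, old_value)
assert spark.conf.get(key, None) is None  # back to unset

spark.stop()
```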
[GitHub] spark pull request #20830: [SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/20830#discussion_r174868549

--- Diff: python/pyspark/sql/tests.py ---
@@ -2461,17 +2461,13 @@ def test_join_without_on(self):
         df1 = self.spark.range(1).toDF("a")
         df2 = self.spark.range(1).toDF("b")

-        try:
-            self.spark.conf.set("spark.sql.crossJoin.enabled", "false")
+        with self.sql_conf({"spark.sql.crossJoin.enabled": False}):
             self.assertRaises(AnalysisException,
                               lambda: df1.join(df2, how="inner").collect())

-        self.spark.conf.set("spark.sql.crossJoin.enabled", "true")
+        with self.sql_conf({"spark.sql.crossJoin.enabled": True}):
--- End diff --

So the `sql_conf` context will change this back to be unset, right?
[GitHub] spark pull request #20830: [SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/20830#discussion_r174857693

--- Diff: python/pyspark/sql/tests.py ---
@@ -3551,12 +3544,12 @@ def test_null_conversion(self):
         self.assertTrue(all([c == 1 for c in null_counts]))

     def _toPandas_arrow_toggle(self, df):
-        self.spark.conf.set("spark.sql.execution.arrow.enabled", "false")
-        try:
+        with self.sql_conf({"spark.sql.execution.arrow.enabled": False}):
             pdf = df.toPandas()
-        finally:
-            self.spark.conf.set("spark.sql.execution.arrow.enabled", "true")
-        pdf_arrow = df.toPandas()
+
+        with self.sql_conf({"spark.sql.execution.arrow.enabled": True}):
--- End diff --

We can omit this when we rely on the default value or set the value in a setup method, but I'm okay if we want to show the value explicitly.
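As a sketch of the alternative mentioned here, setting the value once in a setup method so individual tests only toggle it when needed, assuming a plain `unittest` base class (the class name and session setup are illustrative, not the PR's code):

```python
import unittest

from pyspark.sql import SparkSession


class ArrowEnabledTests(unittest.TestCase):
    """Illustrative base class that enables Arrow once for all its tests."""

    @classmethod
    def setUpClass(cls):
        cls.spark = SparkSession.builder.master("local[1]").getOrCreate()
        # Class-wide default: tests rely on this value implicitly, and only
        # tests that need Arrow disabled toggle it off locally.
        cls.spark.conf.set("spark.sql.execution.arrow.enabled", "true")

    @classmethod
    def tearDownClass(cls):
        cls.spark.stop()

    def test_to_pandas_with_arrow(self):
        # No explicit sql_conf block needed here: the setup default applies.
        pdf = self.spark.range(3).toPandas()
        self.assertEqual(len(pdf), 3)
```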
[GitHub] spark pull request #20830: [SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/20830

[SPARK-23691][PYTHON] Use sql_conf util in PySpark tests where possible

## What changes were proposed in this pull request?

https://github.com/apache/spark/commit/d6632d185e147fcbe6724545488ad80dce20277e added a useful util:

```python
@contextmanager
def sql_conf(self, pairs):
    ...
```

to allow setting and unsetting configurations within a block:

```python
with self.sql_conf({"spark.blah.blah.blah": "blah"}):
    # test codes
```

This PR proposes to use this util where possible in the PySpark tests. Note that there already appear to be a few places that change configurations for tests without restoring the original values in the unittest classes.

## How was this patch tested?

Manually tested via:

```
./run-tests --modules=pyspark-sql --python-executables=python2
./run-tests --modules=pyspark-sql --python-executables=python3
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark cleanup-sql-conf

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20830.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20830

commit 89cf69be7ae00f571c51de402067928b663c5a45
Author: hyukjinkwon
Date: 2018-03-15T04:16:18Z

    Use sql_conf util in PySpark tests where possible
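For reference, here is a minimal sketch of what such a `sql_conf` context manager can look like, written as a method on a test base class that exposes the session as `self.spark`; this approximates the util added in the commit above rather than reproducing it exactly:

```python
from contextlib import contextmanager


@contextmanager
def sql_conf(self, pairs):
    """Set the given SQL configs for the duration of the block, then restore
    each key to its previous value, or unset it if it had no explicit value.
    """
    assert isinstance(pairs, dict), "pairs should be a dictionary."

    keys = list(pairs.keys())
    # None as the default marks keys that had no explicit value before.
    old_values = [self.spark.conf.get(key, None) for key in keys]
    for key in keys:
        self.spark.conf.set(key, pairs[key])
    try:
        yield
    finally:
        for key, old_value in zip(keys, old_values):
            if old_value is None:
                self.spark.conf.unset(key)
            else:
                self.spark.conf.set(key, old_value)
```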