This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 30e1261  [SPARK-37121][HIVE][TEST] Fix Python version detection bug in TestUtils used by HiveExternalCatalogVersionsSuite
30e1261 is described below

commit 30e126176a9e29f23d6a16407074200d377153d6
Author: Erik Krogen <xkro...@apache.org>
AuthorDate: Thu Oct 28 09:46:01 2021 +0900

    [SPARK-37121][HIVE][TEST] Fix Python version detection bug in TestUtils used by HiveExternalCatalogVersionsSuite

### What changes were proposed in this pull request?

Fix a bug in `TestUtils.isPythonVersionAtLeast38` to allow `HiveExternalCatalogVersionsSuite` to test against Spark 2.x releases in environments with Python <= 3.7.

### Why are the changes needed?

The logic in `TestUtils.isPythonVersionAtLeast38` was added in #30044 to prevent Spark 2.4 from being run in an environment where the installed Python 3 version is >= 3.8, which is not compatible with Spark 2.4. However, this method always returns true, so only Spark 3.x versions are ever included in the version set for `HiveExternalCatalogVersionsSuite`, regardless of the system-installed version of Python.

The problem is here:
https://github.com/apache/spark/blob/951efb80856e2a92ba3690886c95643567dae9d0/core/src/main/scala/org/apache/spark/TestUtils.scala#L280-L291

It tries to evaluate the Python version using a `ProcessLogger`, but the logger accepts a `String => Unit` function, i.e. it does not make use of the function's return value in any way (it is meant for logging). So the results of the `startsWith` checks are thrown away, and `attempt.isSuccess && attempt.get == 0` is true as long as the system has a `python3` binary of any version.

### Does this PR introduce _any_ user-facing change?

No, test changes only.

### How was this patch tested?
Confirmed by checking that `HiveExternalCatalogVersionsSuite` downloads binary distros for the Spark 2.x lines as well as 3.x when I symlink my `python3` to Python 3.7, and only downloads distros for the 3.x lines when I symlink my `python3` to Python 3.9.

```bash
brew link --force python3.7
# run HiveExternalCatalogVersionsSuite and validate that 2.x and 3.x tests get executed
brew unlink python3.7
brew link --force python3.9
# run HiveExternalCatalogVersionsSuite and validate that only 3.x tests get executed
```

Closes #34395 from xkrogen/xkrogen-SPARK-37121-testutils-python38-fix.

Authored-by: Erik Krogen <xkro...@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 core/src/main/scala/org/apache/spark/TestUtils.scala | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/TestUtils.scala b/core/src/main/scala/org/apache/spark/TestUtils.scala
index 24e5534..65ef813 100644
--- a/core/src/main/scala/org/apache/spark/TestUtils.scala
+++ b/core/src/main/scala/org/apache/spark/TestUtils.scala
@@ -278,16 +278,9 @@ private[spark] object TestUtils {
   }
 
   def isPythonVersionAtLeast38(): Boolean = {
-    val attempt = if (Utils.isWindows) {
-      Try(Process(Seq("cmd.exe", "/C", "python3 --version"))
-        .run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
-        .exitValue())
-    } else {
-      Try(Process(Seq("sh", "-c", "python3 --version"))
-        .run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
-        .exitValue())
-    }
-    attempt.isSuccess && attempt.get == 0
+    val cmdSeq = if (Utils.isWindows) Seq("cmd.exe", "/C") else Seq("sh", "-c")
+    val pythonSnippet = "import sys; sys.exit(sys.version_info < (3, 8, 0))"
+    Try(Process(cmdSeq :+ s"python3 -c '$pythonSnippet'").! == 0).getOrElse(false)
   }
 
   /**

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
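[Editor's illustration, not part of the original email.] The patch replaces output parsing with an exit-code check: the Python snippet passes a boolean to `sys.exit()`, where `True` becomes exit status 1 and `False` becomes exit status 0, so the version verdict is carried by the return code alone. A minimal sketch of that mechanism, invoking the current interpreter via `sys.executable` (an assumption for portability; the Scala code shells out to a `python3` binary instead):

```python
import subprocess
import sys

# Snippet copied from the diff: sys.exit(True) -> status 1 (version below 3.8),
# sys.exit(False) -> status 0 (version is at least 3.8).
snippet = "import sys; sys.exit(sys.version_info < (3, 8, 0))"

# Run it in a child interpreter and inspect only the exit code, exactly as the
# fixed isPythonVersionAtLeast38 does -- no ProcessLogger or stdout parsing.
result = subprocess.run([sys.executable, "-c", snippet])
is_at_least_38 = result.returncode == 0
print(f"Python >= 3.8: {is_at_least_38}")
```

Unlike the original `ProcessLogger` approach, this cannot silently succeed: `python3 --version` exits 0 on any working interpreter, whereas here the exit code itself encodes the comparison.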