Repository: spark
Updated Branches:
  refs/heads/master fdadc4be0 -> d3eed8fd6
[SPARK-24563][PYTHON] Catch TypeError when testing existence of HiveConf when creating pyspark shell

## What changes were proposed in this pull request?

This PR catches TypeError when testing existence of HiveConf when creating pyspark shell.

## How was this patch tested?

Manually tested. Here are the manual test cases:

Build with hive:
```
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 13:44:09)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
18/06/14 14:55:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
      /_/

Using Python version 3.6.5 (default, Apr 6 2018 13:44:09)
SparkSession available as 'spark'.
>>> spark.conf.get('spark.sql.catalogImplementation')
'hive'
```

Build without hive:
```
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 13:44:09)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
18/06/14 15:04:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
      /_/

Using Python version 3.6.5 (default, Apr 6 2018 13:44:09)
SparkSession available as 'spark'.
>>> spark.conf.get('spark.sql.catalogImplementation')
'in-memory'
```

Failed to start shell:
```
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 13:44:09)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
18/06/14 15:07:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
/Users/icexelloss/workspace/spark/python/pyspark/shell.py:45: UserWarning: Failed to initialize Spark session.
  warnings.warn("Failed to initialize Spark session.")
Traceback (most recent call last):
  File "/Users/icexelloss/workspace/spark/python/pyspark/shell.py", line 41, in <module>
    spark = SparkSession._create_shell_session()
  File "/Users/icexelloss/workspace/spark/python/pyspark/sql/session.py", line 581, in _create_shell_session
    return SparkSession.builder.getOrCreate()
  File "/Users/icexelloss/workspace/spark/python/pyspark/sql/session.py", line 168, in getOrCreate
    raise py4j.protocol.Py4JError("Fake Py4JError")
py4j.protocol.Py4JError: Fake Py4JError
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$
```

Author: Li Jin <ice.xell...@gmail.com>

Closes #21569 from icexelloss/SPARK-24563-fix-pyspark-shell-without-hive.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3eed8fd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3eed8fd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d3eed8fd

Branch: refs/heads/master
Commit: d3eed8fd6d65d95306abfb513a9e0fde05b703ac
Parents: fdadc4b
Author: Li Jin <ice.xell...@gmail.com>
Authored: Thu Jun 14 13:16:20 2018 -0700
Committer: Marcelo Vanzin <van...@cloudera.com>
Committed: Thu Jun 14 13:16:20 2018 -0700

----------------------------------------------------------------------
 python/pyspark/sql/session.py | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/d3eed8fd/python/pyspark/sql/session.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/session.py b/python/pyspark/sql/session.py
index e880dd1..f1ad6b1 100644
--- a/python/pyspark/sql/session.py
+++ b/python/pyspark/sql/session.py
@@ -567,14 +567,7 @@ class SparkSession(object):
                     .getOrCreate()
             else:
                 return SparkSession.builder.getOrCreate()
-        except py4j.protocol.Py4JError:
-            if conf.get('spark.sql.catalogImplementation', '').lower() == 'hive':
-                warnings.warn("Fall back to non-hive support because failing to access HiveConf, "
-                              "please make sure you build spark with hive")
-
-            try:
-                return SparkSession.builder.getOrCreate()
-        except TypeError:
+        except (py4j.protocol.Py4JError, TypeError):
             if conf.get('spark.sql.catalogImplementation', '').lower() == 'hive':
                 warnings.warn("Fall back to non-hive support because failing to access HiveConf, "
                               "please make sure you build spark with hive")
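
For readers skimming the diff, the pattern it settles on (one `except` clause naming both exception types, followed by a shared fallback) can be sketched in isolation. This is only an illustration, not the code in `session.py`: `FakePy4JError`, `create_session`, `probe_hive`, and `build_session` are hypothetical names invented here so the snippet runs without Spark or py4j, and the final non-Hive fallback call is an assumption about the code that follows the hunk.

```python
import warnings


class FakePy4JError(Exception):
    """Hypothetical stand-in for py4j.protocol.Py4JError (keeps the sketch dependency-free)."""


def create_session(conf, probe_hive, build_session):
    """Return a session, falling back to non-Hive when the Hive probe fails.

    All three parameters are illustrative assumptions:
    conf          -- dict of Spark-style settings
    probe_hive    -- callable that raises if HiveConf cannot be reached
    build_session -- callable taking use_hive (bool) and returning a session
    """
    try:
        probe_hive()
        return build_session(use_hive=True)
    except (FakePy4JError, TypeError):
        # A tuple in the except clause routes both failure modes to the
        # same handler, so the warning logic is written only once.
        if conf.get('spark.sql.catalogImplementation', '').lower() == 'hive':
            warnings.warn("Fall back to non-hive support because failing to access HiveConf, "
                          "please make sure you build spark with hive")

    # Assumed fallback: build a session without Hive support.
    return build_session(use_hive=False)
```

Calling `create_session` with a `probe_hive` that raises either exception type takes the same path: it warns only when the configuration explicitly asked for `hive`, then builds the non-Hive session.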