The behavior change is not good...

On Thu, Jun 14, 2018 at 9:05 AM Li Jin <ice.xell...@gmail.com> wrote:
> Ah, looks like it's this change:
>
> https://github.com/apache/spark/commit/b3417b731d4e323398a0d7ec6e86405f4464f4f9#diff-3b5463566251d5b09fd328738a9e9bc5
>
> It seems strange that by default Spark doesn't build with Hive, but by
> default PySpark requires it...
>
> This might also be a behavior change for PySpark users who build Spark
> without Hive. The old behavior is "fall back to non-Hive support" and the
> new behavior is "program won't start".
>
> On Thu, Jun 14, 2018 at 11:51 AM, Sean Owen <sro...@gmail.com> wrote:
>
>> I think you would have to build with the 'hive' profile? But if so,
>> that would have been true for a while now.
>>
>> On Thu, Jun 14, 2018 at 10:38 AM Li Jin <ice.xell...@gmail.com> wrote:
>>
>>> Hey all,
>>>
>>> I just did a clean checkout of github.com/apache/spark but failed to
>>> start PySpark. This is what I did:
>>>
>>> git clone g...@github.com:apache/spark.git; cd spark; build/sbt package;
>>> bin/pyspark
>>>
>>> And got this exception:
>>>
>>> (spark-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
>>>
>>> Python 3.6.3 |Anaconda, Inc.| (default, Nov 8 2017, 18:10:31)
>>> [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
>>> Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> 18/06/14 11:34:14 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> Using Spark's default log4j profile:
>>> org/apache/spark/log4j-defaults.properties
>>> Setting default log level to "WARN".
>>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
>>> setLogLevel(newLevel).
>>>
>>> /Users/icexelloss/workspace/upstream2/spark/python/pyspark/shell.py:45:
>>> UserWarning: Failed to initialize Spark session.
>>>   warnings.warn("Failed to initialize Spark session.")
>>>
>>> Traceback (most recent call last):
>>>   File "/Users/icexelloss/workspace/upstream2/spark/python/pyspark/shell.py",
>>> line 41, in <module>
>>>     spark = SparkSession._create_shell_session()
>>>   File "/Users/icexelloss/workspace/upstream2/spark/python/pyspark/sql/session.py",
>>> line 564, in _create_shell_session
>>>     SparkContext._jvm.org.apache.hadoop.hive.conf.HiveConf()
>>> TypeError: 'JavaPackage' object is not callable
>>>
>>> I also tried to delete the Hadoop deps from my ivy2 cache and reinstall
>>> them, but no luck. I wonder:
>>>
>>> 1. I have not seen this before; could this be caused by a recent
>>>    change to head?
>>> 2. Am I doing something wrong in the build process?
>>>
>>> Thanks much!
>>> Li
>>>
>>
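
For reference, Sean's suggestion does get the shell running again: include Hive
support in the build so that org.apache.hadoop.hive.conf.HiveConf is on the
classpath when shell.py starts. A minimal sketch of that workaround, assuming
the sbt launcher still accepts Maven-style profiles (it did at the time of this
thread):

    # build with the 'hive' profile so the HiveConf class is available,
    # then start the PySpark shell
    build/sbt -Phive package
    bin/pyspark

With a plain `build/sbt package` (no Hive), the new _create_shell_session path
hits HiveConf, which is why the shell now fails with "'JavaPackage' object is
not callable" instead of falling back to a non-Hive session as before.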