Sounds good. Thanks all for the quick replies. https://issues.apache.org/jira/browse/SPARK-24563
On Thu, Jun 14, 2018 at 12:19 PM, Xiao Li <gatorsm...@gmail.com> wrote:
> Thanks for catching this. Please feel free to submit a PR. I do not think
> Vanzin wants to introduce behavior changes in that PR. We should do the
> code review more carefully.
>
> Xiao
>
> 2018-06-14 9:18 GMT-07:00 Li Jin <ice.xell...@gmail.com>:
>
>> Are there objections to restoring the behavior for PySpark users? I am
>> happy to submit a patch.
>>
>> On Thu, Jun 14, 2018 at 12:15 PM Reynold Xin <r...@databricks.com> wrote:
>>
>>> The behavior change is not good...
>>>
>>> On Thu, Jun 14, 2018 at 9:05 AM Li Jin <ice.xell...@gmail.com> wrote:
>>>
>>>> Ah, looks like it's this change:
>>>> https://github.com/apache/spark/commit/b3417b731d4e323398a0d7ec6e86405f4464f4f9#diff-3b5463566251d5b09fd328738a9e9bc5
>>>>
>>>> It seems strange that by default Spark doesn't build with Hive, but by
>>>> default PySpark requires it...
>>>>
>>>> This might also be a behavior change for PySpark users who build Spark
>>>> without Hive. The old behavior is "fall back to non-Hive support" and
>>>> the new behavior is "the program won't start".
>>>>
>>>> On Thu, Jun 14, 2018 at 11:51 AM, Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>>> I think you would have to build with the 'hive' profile? But if so,
>>>>> that would have been true for a while now.
>>>>>
>>>>> On Thu, Jun 14, 2018 at 10:38 AM Li Jin <ice.xell...@gmail.com> wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> I just did a clean checkout of github.com/apache/spark but failed to
>>>>>> start PySpark. This is what I did:
>>>>>>
>>>>>> git clone git@github.com:apache/spark.git; cd spark; build/sbt package; bin/pyspark
>>>>>>
>>>>>> And got this exception:
>>>>>>
>>>>>> (spark-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
>>>>>> Python 3.6.3 |Anaconda, Inc.| (default, Nov 8 2017, 18:10:31)
>>>>>> [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
>>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> 18/06/14 11:34:14 WARN NativeCodeLoader: Unable to load native-hadoop
>>>>>> library for your platform... using builtin-java classes where applicable
>>>>>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>>>>>> Setting default log level to "WARN".
>>>>>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
>>>>>> setLogLevel(newLevel).
>>>>>> /Users/icexelloss/workspace/upstream2/spark/python/pyspark/shell.py:45:
>>>>>> UserWarning: Failed to initialize Spark session.
>>>>>>   warnings.warn("Failed to initialize Spark session.")
>>>>>> Traceback (most recent call last):
>>>>>>   File "/Users/icexelloss/workspace/upstream2/spark/python/pyspark/shell.py",
>>>>>> line 41, in <module>
>>>>>>     spark = SparkSession._create_shell_session()
>>>>>>   File "/Users/icexelloss/workspace/upstream2/spark/python/pyspark/sql/session.py",
>>>>>> line 564, in _create_shell_session
>>>>>>     SparkContext._jvm.org.apache.hadoop.hive.conf.HiveConf()
>>>>>> TypeError: 'JavaPackage' object is not callable
>>>>>>
>>>>>> I also tried deleting the Hadoop deps from my ivy2 cache and
>>>>>> reinstalling them, but no luck. I wonder:
>>>>>>
>>>>>> 1. I have not seen this before; could this be caused by a recent
>>>>>> change to head?
>>>>>> 2. Am I doing something wrong in the build process?
>>>>>>
>>>>>> Thanks much!
>>>>>> Li
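
For reference, this is roughly what the old "fall back to non-Hive support"
behavior discussed above looks like. A minimal sketch, not the actual
SPARK-24563 patch; the helper name is hypothetical, and it leans on the
private SparkContext._jvm / _ensure_initialized API:

    from py4j.protocol import Py4JError

    from pyspark.context import SparkContext
    from pyspark.sql import SparkSession

    def create_shell_session_with_fallback():
        # Private API: make sure the JVM gateway is up before probing it.
        SparkContext._ensure_initialized()
        try:
            # On a build without the 'hive' profile, HiveConf resolves to a
            # bare 'JavaPackage', and calling it raises TypeError -- the
            # exact traceback quoted above.
            SparkContext._jvm.org.apache.hadoop.hive.conf.HiveConf()
            return SparkSession.builder.enableHiveSupport().getOrCreate()
        except (Py4JError, TypeError):
            # Hive classes are missing: degrade to a session backed by the
            # in-memory catalog instead of refusing to start.
            return SparkSession.builder.getOrCreate()

The alternative, as Sean suggests, is to build with Hive support in the first
place (e.g. build/sbt -Phive package), which puts HiveConf on the classpath
and avoids the fallback entirely.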