Jay: It would be nice if you could patch Spark with the PR below and give it a try.
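For reference, the underlying issue is an initialization-order one: Hive’s SessionState is a thread-local that must be started before it can be fetched. A minimal standalone illustration (a sketch against Hive’s SessionState API, not the actual ClientWrapper code path):

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.session.SessionState

object SessionStateOrdering {
  def main(args: Array[String]): Unit = {
    // No SessionState.start() has run on this thread yet, so the
    // thread-local is unset: get() returns null, and get().getConf
    // is exactly the kind of call that throws the NPE.
    assert(SessionState.get() == null)

    // Once start() has run, the same lookup succeeds.
    SessionState.start(new SessionState(new HiveConf()))
    assert(SessionState.get().getConf != null)
  }
}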
Thanks

On Wed, Feb 3, 2016 at 6:03 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Created a pull request:
> https://github.com/apache/spark/pull/11066
>
> FYI
>
> On Wed, Feb 3, 2016 at 1:27 PM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>
>> It was just renamed recently: https://github.com/apache/spark/pull/10981
>>
>> As SessionState is entirely managed by Spark’s code, it still seems like
>> this is a bug with Spark 1.6.0, and not with how our application is using
>> HiveContext. But I’d feel more confident filing a bug if someone else could
>> confirm they’re having this issue with Spark 1.6.0. Ideally, we should also
>> have some simple proof of concept that can be posted with the bug.
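>>
>> Something like the following is what I have in mind for the proof of
>> concept (an untested sketch; the JDBC URL, table name, and database are
>> placeholders for whatever external, non-Hive database is handy):
>>
>> import java.util.Properties
>> import org.apache.spark.{SparkConf, SparkContext}
>> import org.apache.spark.sql.hive.HiveContext
>>
>> object HiveContextNpePoC {
>>   def main(args: Array[String]): Unit = {
>>     val sc = new SparkContext(
>>       new SparkConf().setAppName("HiveContextNpePoC").setMaster("local[*]"))
>>     // No hive-site.xml anywhere on the classpath, mirroring our setup.
>>     val ctx = new HiveContext(sc)
>>     // read().jdbc() against an external database (not Hive) forces the
>>     // lazy Hive client initialization that NPEs on 1.6.0.
>>     val df = ctx.read.jdbc(
>>       "jdbc:postgresql://dbhost:5432/testdb", "some_table", new Properties())
>>     df.show()
>>   }
>> }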
>>
>> From: Ted Yu <yuzhih...@gmail.com>
>> Date: Wednesday, February 3, 2016 at 3:57 PM
>> To: Jay Shipper <shipper_...@bah.com>
>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>>
>> In ClientWrapper.scala, the SessionState.get().getConf call might have been
>> executed ahead of SessionState.start(state) at line 194.
>>
>> This was the JIRA:
>>
>> [SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL
>>
>> In the master branch, there is no ClientWrapper.scala anymore.
>>
>> FYI
>>
>> On Wed, Feb 3, 2016 at 11:15 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>>
>>> One quick update on this: the NPE is not happening with Spark 1.5.2, so
>>> this problem seems specific to Spark 1.6.0.
>>>
>>> From: Jay Shipper <shipper_...@bah.com>
>>> Date: Wednesday, February 3, 2016 at 12:06 PM
>>> To: "user@spark.apache.org" <user@spark.apache.org>
>>> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>>>
>>> Right, I could already tell that from the stack trace and from looking at
>>> Spark’s code. What I’m trying to determine is why that’s coming back as
>>> null now, just from upgrading Spark to 1.6.0.
>>>
>>> From: Ted Yu <yuzhih...@gmail.com>
>>> Date: Wednesday, February 3, 2016 at 12:04 PM
>>> To: Jay Shipper <shipper_...@bah.com>
>>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>>> Subject: [External] Re: Spark 1.6.0 HiveContext NPE
>>>
>>> Looks like the NPE came from this line:
>>>
>>> def conf: HiveConf = SessionState.get().getConf
>>>
>>> Meaning SessionState.get() returned null.
>>>
>>> On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>>>
>>>> I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m
>>>> getting a NullPointerException from HiveContext. It’s happening while it
>>>> tries to load some tables via JDBC from an external database (not Hive),
>>>> using context.read().jdbc():
>>>>
>>>> —
>>>> java.lang.NullPointerException
>>>> at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
>>>> at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
>>>> at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
>>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
>>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
>>>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>> at scala.collection.immutable.List.foreach(List.scala:318)
>>>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>>> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>>> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
>>>> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
>>>> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
>>>> at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
>>>> at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
>>>> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
>>>> at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
>>>> at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
>>>> at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
>>>> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
>>>> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
>>>> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
>>>> at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
>>>> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
>>>> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
>>>> —
>>>>
>>>> Even though the application is not using Hive, HiveContext is used instead
>>>> of SQLContext for the additional functionality it provides. There’s no
>>>> hive-site.xml for the application, but this did not cause an issue with
>>>> Spark 1.4.1.
>>>>
>>>> Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that
>>>> could explain this NPE? The only obvious change I’ve noticed in HiveContext
>>>> is that the default warehouse location is different (1.4.1: the current
>>>> directory; 1.6.0: /user/hive/warehouse), but I verified that this NPE
>>>> happens even when /user/hive/warehouse exists and is readable/writable by
>>>> the application. In terms of changes to the application to work with Spark
>>>> 1.6.0, the only one that might be relevant to this issue is the upgrade of
>>>> the Hadoop dependencies to match what Spark 1.6.0 uses
>>>> (2.6.0-cdh5.7.0-SNAPSHOT).
>>>>
>>>> Thanks,
>>>> Jay