Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-08 Thread Shipper, Jay [USA]
I looked back into this today.  I made some changes last week to the 
application to allow not only compatibility with Spark 1.5.2 but also 
backwards compatibility with Spark 1.4.1 (the version our current deployment 
uses).  The changes mostly involved moving dependencies from compile to 
provided scope, and removing dependencies that conflict with what’s bundled in 
the Spark assembly JAR, particularly the Scala and SLF4J libraries.  Now the 
application works fine with Spark 1.6.0; the NPE is not occurring, no patch 
necessary.  So unfortunately, I won’t be able to help determine the root 
cause, as I can no longer replicate the issue.
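
In case it helps anyone else, the change looks roughly like this as an sbt 
sketch (the coordinates and the excluded module are illustrative, not our 
exact build):

  // build.sbt sketch (illustrative only, not our exact build).
  // Spark artifacts move to "provided" scope so the Spark assembly JAR on the
  // cluster supplies them, along with its own Scala and SLF4J copies, at runtime.
  val sparkVersion = "1.6.0"
  libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
    "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
    "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
  )
  // Strip a conflicting transitive SLF4J binding from an application dependency
  // ("some.org" % "some-lib" is a hypothetical placeholder):
  libraryDependencies += ("some.org" % "some-lib" % "1.0")
    .exclude("org.slf4j", "slf4j-log4j12")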

Thanks for your help.

From: Ted Yu <yuzhih...@gmail.com>
Date: Friday, February 5, 2016 at 5:40 PM
To: Jay Shipper <shipper_...@bah.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: [External] Re: Spark 1.6.0 HiveContext NPE

Were there any other exceptions in the client log?

Just want to find the cause of this NPE.

Thanks

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m getting a 
NullPointerException from HiveContext.  It’s happening while it tries to load 
some tables via JDBC from an external database (not Hive), using 
context.read().jdbc():

—
java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application is not using Hive, HiveContext is used instead of 
SQLContext, for the additional functionality it provides.  There’s no 
hive-site.xml for the application, but this did not cause an issue for Spark 
1.4.1.
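
For concreteness, the failing call looks roughly like this (a minimal sketch; 
the JDBC URL, table name, and credentials are placeholders, not our real 
configuration):

  import java.util.Properties
  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  val sc = new SparkContext(new SparkConf().setAppName("jdbc-load"))
  val context = new HiveContext(sc)

  val props = new Properties()
  props.setProperty("user", "app_user")         // placeholder
  props.setProperty("password", "app_password") // placeholder

  // Constructing the DataFrame already forces HiveContext's lazy
  // analyzer -> catalog -> metadataHive -> hiveconf chain (per the trace
  // above), so the NPE fires on the read call itself.
  val df = context.read.jdbc("jdbc:postgresql://dbhost:5432/appdb", "some_table", props)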

Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that could 
explain this NPE?  The only obvious change I’ve noticed for HiveContext is that 
the default warehouse location is different (1.4.1 - current directory, 1.6.0 - 
/user/hive/warehouse), but I verified that this NPE happens even when 
/user/hive/warehouse exists and is readable/writeable for the application.  In 
terms of changes to the application to work with Spark 1.6.0, the only one that 
might be relevant to this issue is the upgrade in the Hadoop dependencies to 
match what Spark 1.6.0 uses (2.6.0-cdh5.7.0-SNAPSHOT).
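
One way to take the warehouse default change out of the equation entirely 
would be to pin the location explicitly (a sketch using the standard Hive 
property name; I have not confirmed this changes anything here):

  // Hypothetical diagnostic: pin the warehouse dir so the 1.4.1 -> 1.6.0
  // default-location change is ruled out.
  context.setConf("hive.metastore.warehouse.dir", "/user/hive/warehouse")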

Thanks,
Jay



Re: Spark 1.6.0 HiveContext NPE

2016-02-05 Thread Ted Yu
Were there any other exceptions in the client log?

Just want to find the cause of this NPE.

Thanks

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] wrote:

> I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m
> getting a NullPointerException from HiveContext.  It’s happening while it
> tries to load some tables via JDBC from an external database (not Hive),
> using context.read().jdbc():
>
> —
> java.lang.NullPointerException
> at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
> at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
> at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
> at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
> at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
> at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
> at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
> at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
> at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
> —
>
> Even though the application is not using Hive, HiveContext is used instead
> of SQLContext, for the additional functionality it provides.  There’s no
> hive-site.xml for the application, but this did not cause an issue for
> Spark 1.4.1.
>
> Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that
> could explain this NPE?  The only obvious change I’ve noticed for
> HiveContext is that the default warehouse location is different (1.4.1 -
> current directory, 1.6.0 - /user/hive/warehouse), but I verified that this
> NPE happens even when /user/hive/warehouse exists and is readable/writeable
> for the application.  In terms of changes to the application to work with
> Spark 1.6.0, the only one that might be relevant to this issue is the
> upgrade in the Hadoop dependencies to match what Spark 1.6.0 uses
> (2.6.0-cdh5.7.0-SNAPSHOT).
>
> Thanks,
> Jay
>


Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-04 Thread Ted Yu
Jay:
It would be nice if you could patch Spark with the PR below and give it a try.

Thanks

On Wed, Feb 3, 2016 at 6:03 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Created a pull request:
> https://github.com/apache/spark/pull/11066
>
> FYI
>
> On Wed, Feb 3, 2016 at 1:27 PM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>
>> It was just renamed recently: https://github.com/apache/spark/pull/10981
>>
>> As SessionState is entirely managed by Spark’s code, it still seems like
>> this is a bug with Spark 1.6.0, and not with how our application is using
>> HiveContext.  But I’d feel more confident filing a bug if someone else
>> could confirm they’re having this issue with Spark 1.6.0.  Ideally, we
>> should also have some simple proof of concept that can be posted with the
>> bug.
>>
>> From: Ted Yu <yuzhih...@gmail.com>
>> Date: Wednesday, February 3, 2016 at 3:57 PM
>> To: Jay Shipper <shipper_...@bah.com>
>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>>
>> In ClientWrapper.scala, the SessionState.get().getConf call might have
>> been executed ahead of SessionState.start(state) at line 194.
>>
>> This was the JIRA:
>>
>> [SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL
>>
>> In the master branch, there is no longer a ClientWrapper.scala.
>>
>> FYI
>>
>> On Wed, Feb 3, 2016 at 11:15 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>>
>>> One quick update on this: The NPE is not happening with Spark 1.5.2, so
>>> this problem seems specific to Spark 1.6.0.
>>>
>>> From: Jay Shipper <shipper_...@bah.com>
>>> Date: Wednesday, February 3, 2016 at 12:06 PM
>>> To: "user@spark.apache.org" <user@spark.apache.org>
>>> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>>>
>>> Right, I could already tell that from the stack trace and looking at
>>> Spark’s code.  What I’m trying to determine is why that’s coming back as
>>> null now, just from upgrading Spark to 1.6.0.
>>>
>>> From: Ted Yu <yuzhih...@gmail.com>
>>> Date: Wednesday, February 3, 2016 at 12:04 PM
>>> To: Jay Shipper <shipper_...@bah.com>
>>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>>> Subject: [External] Re: Spark 1.6.0 HiveContext NPE
>>>
>>> Looks like the NPE came from this line:
>>>   def conf: HiveConf = SessionState.get().getConf
>>>
>>> Meaning SessionState.get() returned null.
>>>
>>> On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>>>
>>>> I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m
>>>> getting a NullPointerException from HiveContext.  It’s happening while it
>>>> tries to load some tables via JDBC from an external database (not Hive),
>>>> using context.read().jdbc():
>>>>
>>>> —
>>>> java.lang.NullPointerException
>>>> at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
>>>> at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
>>>> at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
>>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
>>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
>>>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>> at scala.collection.immutable.List.foreach(List.scala:318)
>>>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>>> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>>> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
>>>> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
>>>> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
>>>> at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
>>>> at org.apache.spark.sql.hive.HiveContext.catalog$lzy

Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Shipper, Jay [USA]
It was just renamed recently: https://github.com/apache/spark/pull/10981

As SessionState is entirely managed by Spark’s code, it still seems like this 
is a bug with Spark 1.6.0, and not with how our application is using 
HiveContext.  But I’d feel more confident filing a bug if someone else could 
confirm they’re having this issue with Spark 1.6.0.  Ideally, we should also 
have some simple proof of concept that can be posted with the bug.

From: Ted Yu <yuzhih...@gmail.com>
Date: Wednesday, February 3, 2016 at 3:57 PM
To: Jay Shipper <shipper_...@bah.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE

In ClientWrapper.scala, the SessionState.get().getConf call might have been 
executed ahead of SessionState.start(state) at line 194.

This was the JIRA:

[SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL

In the master branch, there is no longer a ClientWrapper.scala.

FYI

On Wed, Feb 3, 2016 at 11:15 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
One quick update on this: The NPE is not happening with Spark 1.5.2, so this 
problem seems specific to Spark 1.6.0.

From: Jay Shipper <shipper_...@bah.com>
Date: Wednesday, February 3, 2016 at 12:06 PM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE

Right, I could already tell that from the stack trace and looking at Spark’s 
code.  What I’m trying to determine is why that’s coming back as null now, just 
from upgrading Spark to 1.6.0.

From: Ted Yu <yuzhih...@gmail.com>
Date: Wednesday, February 3, 2016 at 12:04 PM
To: Jay Shipper <shipper_...@bah.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: [External] Re: Spark 1.6.0 HiveContext NPE

Looks like the NPE came from this line:
  def conf: HiveConf = SessionState.get().getConf

Meaning SessionState.get() returned null.

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m getting a 
NullPointerException from HiveContext.  It’s happening while it tries to load 
some tables via JDBC from an external database (not Hive), using 
context.read().jdbc():

—
java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application is not using Hive, HiveContext is used instead of 
SQLContext, for the additional functionality it provides.  There’s no 
hive-site.xml for the application, but this did not cause an issue for Spark 
1.4.1.

Does anyone have an i

Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Ted Yu
In ClientWrapper.scala, the SessionState.get().getConf call might have been
executed ahead of SessionState.start(state) at line 194.

This was the JIRA:

[SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL

In the master branch, there is no longer a ClientWrapper.scala.

FYI
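
To make the suspected ordering problem concrete, here is a simplified model 
(not the actual Spark/Hive source): SessionState hands out per-thread state, 
so get() returns null on any thread where start() has not run yet.

  // Simplified model of the suspected bug -- NOT Spark/Hive source code.
  case class StateModel(conf: Map[String, String])

  object SessionStateModel {
    private val current = new ThreadLocal[StateModel]()
    def start(state: StateModel): Unit = current.set(state) // cf. SessionState.start at line 194
    def get(): StateModel = current.get()                   // null if start() never ran on this thread
  }

  // Mirrors ClientWrapper.conf (ClientWrapper.scala:205), which dereferences
  // get() unconditionally -- this is the line that NPEs in the trace above:
  def conf: Map[String, String] = SessionStateModel.get().conf

  // Correct ordering on a single thread: start() first, then conf.
  SessionStateModel.start(StateModel(Map("hive.metastore.warehouse.dir" -> "/user/hive/warehouse")))
  assert(conf.contains("hive.metastore.warehouse.dir"))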

On Wed, Feb 3, 2016 at 11:15 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:

> One quick update on this: The NPE is not happening with Spark 1.5.2, so
> this problem seems specific to Spark 1.6.0.
>
> From: Jay Shipper <shipper_...@bah.com>
> Date: Wednesday, February 3, 2016 at 12:06 PM
> To: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>
> Right, I could already tell that from the stack trace and looking at
> Spark’s code.  What I’m trying to determine is why that’s coming back as
> null now, just from upgrading Spark to 1.6.0.
>
> From: Ted Yu <yuzhih...@gmail.com>
> Date: Wednesday, February 3, 2016 at 12:04 PM
> To: Jay Shipper <shipper_...@bah.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: [External] Re: Spark 1.6.0 HiveContext NPE
>
> Looks like the NPE came from this line:
>   def conf: HiveConf = SessionState.get().getConf
>
> Meaning SessionState.get() returned null.
>
> On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>
>> I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m
>> getting a NullPointerException from HiveContext.  It’s happening while it
>> tries to load some tables via JDBC from an external database (not Hive),
>> using context.read().jdbc():
>>
>> —
>> java.lang.NullPointerException
>> at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
>> at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
>> at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>> at scala.collection.immutable.List.foreach(List.scala:318)
>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
>> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
>> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
>> at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
>> at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
>> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
>> at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
>> at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
>> at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
>> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
>> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
>> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
>> at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
>> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
>> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
>> —
>>
>> Even though the application is not using Hive, HiveContext is used
>> instead of SQLContext, for the additional functionality it provides.
>> There’s no hive-site.xml for the application, but this did not cause an
>> issue for Spark 1.4.1.
>>
>> Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that
>> could explain this NPE?  The only obvious change I’ve noticed for
>> HiveContext is that the default warehouse location is different (1.4.1 -
>> current directory, 1.6.0 - /user/hive/warehouse), but I verified that this
>> NPE happens even when /user/hive/warehouse exists and is readable/writeable
>> for the application.  In terms of changes to the application to work with
>> Spark 1.6.0, the only one that might be relevant to this issue is the
>> upgrade in the Hadoop dependencies to match what Spark 1.6.0 uses
>> (2.6.0-cdh5.7.0-SNAPSHOT).
>>
>> Thanks,
>> Jay
>>
>
>


Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Ted Yu
Created a pull request:
https://github.com/apache/spark/pull/11066

FYI

On Wed, Feb 3, 2016 at 1:27 PM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:

> It was just renamed recently: https://github.com/apache/spark/pull/10981
>
> As SessionState is entirely managed by Spark’s code, it still seems like
> this is a bug with Spark 1.6.0, and not with how our application is using
> HiveContext.  But I’d feel more confident filing a bug if someone else
> could confirm they’re having this issue with Spark 1.6.0.  Ideally, we
> should also have some simple proof of concept that can be posted with the
> bug.
>
> From: Ted Yu <yuzhih...@gmail.com>
> Date: Wednesday, February 3, 2016 at 3:57 PM
> To: Jay Shipper <shipper_...@bah.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>
> In ClientWrapper.scala, the SessionState.get().getConf call might have
> been executed ahead of SessionState.start(state) at line 194.
>
> This was the JIRA:
>
> [SPARK-10810] [SPARK-10902] [SQL] Improve session management in SQL
>
> In the master branch, there is no longer a ClientWrapper.scala.
>
> FYI
>
> On Wed, Feb 3, 2016 at 11:15 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>
>> One quick update on this: The NPE is not happening with Spark 1.5.2, so
>> this problem seems specific to Spark 1.6.0.
>>
>> From: Jay Shipper <shipper_...@bah.com>
>> Date: Wednesday, February 3, 2016 at 12:06 PM
>> To: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE
>>
>> Right, I could already tell that from the stack trace and looking at
>> Spark’s code.  What I’m trying to determine is why that’s coming back as
>> null now, just from upgrading Spark to 1.6.0.
>>
>> From: Ted Yu <yuzhih...@gmail.com>
>> Date: Wednesday, February 3, 2016 at 12:04 PM
>> To: Jay Shipper <shipper_...@bah.com>
>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>> Subject: [External] Re: Spark 1.6.0 HiveContext NPE
>>
>> Looks like the NPE came from this line:
>>   def conf: HiveConf = SessionState.get().getConf
>>
>> Meaning SessionState.get() returned null.
>>
>> On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
>>
>>> I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m
>>> getting a NullPointerException from HiveContext.  It’s happening while it
>>> tries to load some tables via JDBC from an external database (not Hive),
>>> using context.read().jdbc():
>>>
>>> —
>>> java.lang.NullPointerException
>>> at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
>>> at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
>>> at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
>>> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
>>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>> at scala.collection.immutable.List.foreach(List.scala:318)
>>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
>>> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
>>> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
>>> at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
>>> at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
>>> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
>>> at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
>>> at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
>>> at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
>>> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
>>> at org.apache.sp

Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Shipper, Jay [USA]
I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m getting a 
NullPointerException from HiveContext.  It’s happening while it tries to load 
some tables via JDBC from an external database (not Hive), using 
context.read().jdbc():

—
java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application is not using Hive, HiveContext is used instead of 
SQLContext, for the additional functionality it provides.  There’s no 
hive-site.xml for the application, but this did not cause an issue for Spark 
1.4.1.

Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that could 
explain this NPE?  The only obvious change I’ve noticed for HiveContext is that 
the default warehouse location is different (1.4.1 - current directory, 1.6.0 - 
/user/hive/warehouse), but I verified that this NPE happens even when 
/user/hive/warehouse exists and is readable/writeable for the application.  In 
terms of changes to the application to work with Spark 1.6.0, the only one that 
might be relevant to this issue is the upgrade in the Hadoop dependencies to 
match what Spark 1.6.0 uses (2.6.0-cdh5.7.0-SNAPSHOT).

Thanks,
Jay


Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Shipper, Jay [USA]
Right, I could already tell that from the stack trace and looking at Spark’s 
code.  What I’m trying to determine is why that’s coming back as null now, just 
from upgrading Spark to 1.6.0.

From: Ted Yu <yuzhih...@gmail.com>
Date: Wednesday, February 3, 2016 at 12:04 PM
To: Jay Shipper <shipper_...@bah.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: [External] Re: Spark 1.6.0 HiveContext NPE

Looks like the NPE came from this line:
  def conf: HiveConf = SessionState.get().getConf

Meaning SessionState.get() returned null.

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m getting a 
NullPointerException from HiveContext.  It’s happening while it tries to load 
some tables via JDBC from an external database (not Hive), using 
context.read().jdbc():

—
java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application is not using Hive, HiveContext is used instead of 
SQLContext, for the additional functionality it provides.  There’s no 
hive-site.xml for the application, but this did not cause an issue for Spark 
1.4.1.

Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that could 
explain this NPE?  The only obvious change I’ve noticed for HiveContext is that 
the default warehouse location is different (1.4.1 - current directory, 1.6.0 - 
/user/hive/warehouse), but I verified that this NPE happens even when 
/user/hive/warehouse exists and is readable/writeable for the application.  In 
terms of changes to the application to work with Spark 1.6.0, the only one that 
might be relevant to this issue is the upgrade in the Hadoop dependencies to 
match what Spark 1.6.0 uses (2.6.0-cdh5.7.0-SNAPSHOT).

Thanks,
Jay



Re: Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Ted Yu
Looks like the NPE came from this line:
  def conf: HiveConf = SessionState.get().getConf

Meaning SessionState.get() returned null.

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] wrote:

> I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m
> getting a NullPointerException from HiveContext.  It’s happening while it
> tries to load some tables via JDBC from an external database (not Hive),
> using context.read().jdbc():
>
> —
> java.lang.NullPointerException
> at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
> at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
> at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
> at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
> at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
> at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
> at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
> at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
> at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
> at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
> at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
> at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
> at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
> —
>
> Even though the application is not using Hive, HiveContext is used instead
> of SQLContext, for the additional functionality it provides.  There’s no
> hive-site.xml for the application, but this did not cause an issue for
> Spark 1.4.1.
>
> Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that
> could explain this NPE?  The only obvious change I’ve noticed for
> HiveContext is that the default warehouse location is different (1.4.1 -
> current directory, 1.6.0 - /user/hive/warehouse), but I verified that this
> NPE happens even when /user/hive/warehouse exists and is readable/writeable
> for the application.  In terms of changes to the application to work with
> Spark 1.6.0, the only one that might be relevant to this issue is the
> upgrade in the Hadoop dependencies to match what Spark 1.6.0 uses
> (2.6.0-cdh5.7.0-SNAPSHOT).
>
> Thanks,
> Jay
>


Re: [External] Re: Spark 1.6.0 HiveContext NPE

2016-02-03 Thread Shipper, Jay [USA]
One quick update on this: The NPE is not happening with Spark 1.5.2, so this 
problem seems specific to Spark 1.6.0.

From: Jay Shipper <shipper_...@bah.com>
Date: Wednesday, February 3, 2016 at 12:06 PM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: [External] Re: Spark 1.6.0 HiveContext NPE

Right, I could already tell that from the stack trace and looking at Spark’s 
code.  What I’m trying to determine is why that’s coming back as null now, just 
from upgrading Spark to 1.6.0.

From: Ted Yu <yuzhih...@gmail.com>
Date: Wednesday, February 3, 2016 at 12:04 PM
To: Jay Shipper <shipper_...@bah.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: [External] Re: Spark 1.6.0 HiveContext NPE

Looks like the NPE came from this line:
  def conf: HiveConf = SessionState.get().getConf

Meaning SessionState.get() returned null.

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:
I’m upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I’m getting a 
NullPointerException from HiveContext.  It’s happening while it tries to load 
some tables via JDBC from an external database (not Hive), using 
context.read().jdbc():

—
java.lang.NullPointerException
at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application is not using Hive, HiveContext is used instead of 
SQLContext, for the additional functionality it provides.  There’s no 
hive-site.xml for the application, but this did not cause an issue for Spark 
1.4.1.

Does anyone have an idea about what’s changed from 1.4.1 to 1.6.0 that could 
explain this NPE?  The only obvious change I’ve noticed for HiveContext is that 
the default warehouse location is different (1.4.1 - current directory, 1.6.0 - 
/user/hive/warehouse), but I verified that this NPE happens even when 
/user/hive/warehouse exists and is readable/writeable for the application.  In 
terms of changes to the application to work with Spark 1.6.0, the only one that 
might be relevant to this issue is the upgrade in the Hadoop dependencies to 
match what Spark 1.6.0 uses (2.6.0-cdh5.7.0-SNAPSHOT).

Thanks,
Jay