Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Zhan Zhang
Hi Jerry,

I have created https://issues.apache.org/jira/browse/SPARK-11562 for this issue.

Thanks.

Zhan Zhang


Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Zhan Zhang
I agree, with a minor change: add a config that provides the option to initialize 
either SQLContext or HiveContext, with HiveContext as the default, rather than 
bypassing HiveContext only when an exception is hit.
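
For illustration, a rough sketch of what that option could look like in the 
shell's context-creation path, assuming a hypothetical spark.sql.hive.enabled 
flag (not an existing Spark setting):

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Hypothetical flag: pick the context type at shell startup, keeping
// HiveContext as the default.
def createSQLContext(sc: SparkContext): SQLContext =
  if (sc.getConf.getBoolean("spark.sql.hive.enabled", defaultValue = true))
    new org.apache.spark.sql.hive.HiveContext(sc)  // today's default behavior
  else
    new SQLContext(sc)  // plain SQLContext, no metastore involved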

Thanks.

Zhan Zhang


Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Jerry Lam
Hi Zhan,

Thank you for providing a workaround! 
I will try this out, but I agree with Ted: there should be a better way, namely 
to capture the exception and handle it by initializing a SQLContext instead of a 
HiveContext, and to warn the user that something is wrong with their Hive setup.

Having a spark.sql.hive.enabled=false configuration would be lovely too. :)
An additional bonus: from a rough observation, skipping HiveContext saves 
roughly 100-200 MB of memory on the driver side.

Thanks and have a nice weekend!

Jerry

Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Ted Yu
I would suggest adding a config parameter that allows bypassing HiveContext 
initialization when an SQLException occurs.

Cheers

Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Zhan Zhang
Hi Jerry,

OK, here is an ugly workaround.

Put a hive-site.xml with invalid content under $SPARK_HOME/conf. You will get a 
bunch of exceptions because HiveContext initialization fails, but you can then 
initialize your own SQLContext.

scala>  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = 
org.apache.spark.sql.SQLContext@4a5cc2e8

scala> import sqlContext.implicits._
import sqlContext.implicits._
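
Once the plain SQLContext is up, the DataFrame API works as usual. A quick 
smoke test of my own, relying on the implicits import above:

scala> val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
scala> df.filter($"id" > 1).show()  // prints the single row (2, b)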


For example, here is the hive-site.xml I used:

HW11188:spark zzhang$ more conf/hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://zzhang-yarn11:9083</value>
  </property>
</configuration>
HW11188:spark zzhang$

By the way, I don’t know whether there is any caveat for this workaround.

Thanks.

Zhan Zhang

Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Jerry Lam
Hi Zhan,

I don’t use HiveContext features at all; I mostly use the DataFrame API. It is 
sexier and involves far fewer typos. :)
Also, HiveContext requires a metastore database setup (Derby by default). The 
problem is that I cannot have two spark-shell sessions running at the same time 
on the same host (e.g. from the same /home/jerry directory); it gives me an 
exception like the one below.

Since I don’t use HiveContext, I don’t see the need to maintain a database.

What is interesting is that the pyspark shell is able to start more than one 
session at the same time. I wonder what pyspark does better than spark-shell?

Best Regards,

Jerry



Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Zhan Zhang
If your assembly jar has the Hive jars included, HiveContext will be used. 
Typically, HiveContext has more functionality than SQLContext. In what case do 
you have to use SQLContext for something that cannot be done with HiveContext?

Thanks.

Zhan Zhang

Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Jerry Lam
What is interesting is that the pyspark shell works fine with multiple sessions 
on the same host, even though multiple HiveContexts have been created. What does 
pyspark do differently when starting up the shell?




Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Jerry Lam
Hi Ted,

I was trying to set spark.sql.dialect to sql to specify that I only need 
SQLContext, not HiveContext. It didn’t work; it still instantiates HiveContext. 
I don’t use HiveContext, and I don’t want to stand up a MySQL database just so 
that I can run more than one spark-shell session simultaneously. (The embedded 
Derby metastore_db can be opened by only one JVM at a time, which is why a 
second shell started from the same directory fails.) Is there an easy way to 
get around this? More of the exception here:

Caused by: java.sql.SQLException: Unable to open a test connection to the given 
database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, 
username = APP. Terminating connection pool (set lazyInit to true if you expect 
to start your database after your app). Original Exception: --
java.sql.SQLException: Failed to start database 'metastore_db' with class 
loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@53a39109, 
see the next exception for details.
        at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
        at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
        at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
        at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
        at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
        at org.apache.derby.jdbc.Driver20.connect(Unknown Source)
        at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:571)

Best Regards,

Jerry


Re: [Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Ted Yu
In SQLContext.scala:
// After we have populated SQLConf, we call setConf to populate other confs
// in the subclass (e.g. hiveconf in HiveContext).
properties.foreach {
  case (key, value) => setConf(key, value)
}

I don't see a config for skipping the above call.

FYI
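
To make the failure path concrete, here is a stripped-down sketch of my own 
(the names mirror the stack trace, not Spark's actual classes): the base 
constructor pushes properties through setConf, and the Hive subclass's setConf 
forces the lazy metastore client while the superclass constructor is still 
running.

// Sketch only, not Spark code.
class SQLContextSketch(properties: Map[String, String]) {
  def setConf(key: String, value: String): Unit = ()    // no-op in the base class
  properties.foreach { case (k, v) => setConf(k, v) }   // runs during construction
}

class HiveContextSketch(properties: Map[String, String])
    extends SQLContextSketch(properties) {
  // Stands in for metadataHive: forced the first time setConf is called.
  lazy val metadataHive: AnyRef =
    throw new RuntimeException("Unable to instantiate metastore client")
  override def setConf(key: String, value: String): Unit = { metadataHive; () }
}

// new HiveContextSketch(Map("hive.metastore.uris" -> "thrift://host:9083"))
// would throw during construction, matching the spark-shell startup trace.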



[Spark-SQL]: Disable HiveContext from instantiating in spark-shell

2015-11-06 Thread Jerry Lam
Hi spark users and developers,

Is it possible to prevent HiveContext from being instantiated when using
spark-shell? I get the following errors when more than one session starts.
Since I don't use HiveContext, it would be great if I could have more than one
spark-shell running at the same time.

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
        at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
        at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
        at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
        at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:235)
        at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:234)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:234)
        at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:72)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
        at org.apache.spark.repl.SparkILoopExt.importSpark(SparkILoopExt.scala:154)
        at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply$mcZ$sp(SparkILoopExt.scala:127)
        at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
        at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)

Best Regards,

Jerry