Hi Zhan,

I don’t use HiveContext features at all; I mostly use the DataFrame API. It is sexier and involves much less typing. :) Also, HiveContext requires a metastore database setup (Derby by default). The problem is that I cannot have two spark-shell sessions running at the same time on the same host from the same directory (e.g. /home/jerry). It gives me an exception like the one below.
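(A sketch of one possible workaround, based on the fact that embedded Derby allows only one JVM to hold its database at a time and that spark-shell creates `metastore_db` relative to the current working directory. The `/tmp/metastore_db_$$` path is just an illustrative choice; I believe the `spark.hadoop.*` prefix forwards the property to the Hive configuration, but treat this as an assumption to verify:)

```
# Give each spark-shell session its own embedded Derby metastore path,
# so the single-connection limit of embedded Derby no longer collides.
# $$ is the shell's PID, making the path unique per session.
spark-shell --conf "spark.hadoop.javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=/tmp/metastore_db_$$;create=true"
```

Alternatively, simply launching each spark-shell from a different working directory should avoid the lock, since the embedded `metastore_db` directory is created in the current working directory.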
Since I don’t use HiveContext, I don’t see the need to maintain a database. What is interesting is that the pyspark shell is able to start more than one session at the same time. I wonder what pyspark has done better than spark-shell?

Best Regards,

Jerry

> On Nov 6, 2015, at 5:28 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>
> If your assembly jar has the Hive jars included, the HiveContext will be used.
> Typically, HiveContext has more functionality than SQLContext. In what case
> do you have to use SQLContext for something that cannot be done by HiveContext?
>
> Thanks.
>
> Zhan Zhang
>
> On Nov 6, 2015, at 10:43 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> What is interesting is that the pyspark shell works fine with multiple sessions
>> on the same host even though multiple HiveContexts have been created. What
>> does pyspark do differently in terms of starting up the shell?
>>
>>> On Nov 6, 2015, at 12:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>> In SQLContext.scala:
>>>
>>>   // After we have populated SQLConf, we call setConf to populate other
>>>   // confs in the subclass (e.g. hiveconf in HiveContext).
>>>   properties.foreach {
>>>     case (key, value) => setConf(key, value)
>>>   }
>>>
>>> I don't see a config for skipping the above call.
>>>
>>> FYI
>>>
>>> On Fri, Nov 6, 2015 at 8:53 AM, Jerry Lam <chiling...@gmail.com> wrote:
>>> Hi spark users and developers,
>>>
>>> Is it possible to disable HiveContext from being instantiated when using
>>> spark-shell? I get the following errors when more than one session
>>> starts. Since I don't use HiveContext, it would be great if I could have more
>>> than one spark-shell started at the same time.
>>>
>>> Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>>>     at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>     at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
>>>     at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
>>>     at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
>>>     at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>>>     at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
>>>     at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:235)
>>>     at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:234)
>>>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>>     at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>>     at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>>     at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:234)
>>>     at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:72)
>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>>     at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
>>>     at org.apache.spark.repl.SparkILoopExt.importSpark(SparkILoopExt.scala:154)
>>>     at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply$mcZ$sp(SparkILoopExt.scala:127)
>>>     at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
>>>     at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
>>>
>>> Best Regards,
>>>
>>> Jerry
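(For reference, a sketch of working with a plain SQLContext in a Spark 1.x spark-shell. This sidesteps Hive for your own queries, but note it does not stop spark-shell itself from instantiating a HiveContext at startup, which is what triggers the Derby lock in the first place. The `people.json` path is a hypothetical example file:)

```
// Build a plain SQLContext so no Hive metastore is involved in queries.
// Inside spark-shell, `sc` is the pre-created SparkContext.
import org.apache.spark.sql.SQLContext

val sqlCtx = new SQLContext(sc)
val df = sqlCtx.read.json("people.json")  // hypothetical input file
df.printSchema()
```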