Sharing HiveContext in Spark JobServer / getOrCreate
Hi,

I am using a shared SparkContext for all of my Spark jobs. Some of the jobs use HiveContext, but there isn't a getOrCreate method on HiveContext that would allow reuse of an existing HiveContext. Such a method exists on SQLContext only (def getOrCreate(sparkContext: SparkContext): SQLContext).

Is there any reason that a HiveContext cannot be shared amongst multiple threads within the same Spark driver process?

In addition, I cannot seem to cast a HiveContext to a SQLContext, yet this works fine in the spark shell. Am I doing something wrong here?

scala> sqlContext
res19: org.apache.spark.sql.SQLContext = org.apache.spark.sql.hive.HiveContext@383b3357

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext

scala> SQLContext.getOrCreate(sc)
res18: org.apache.spark.sql.SQLContext = org.apache.spark.sql.hive.HiveContext@383b3357

Regards
Deenar
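[Editor's note: a minimal sketch of one possible workaround, assuming Spark 1.x where HiveContext extends SQLContext. The shell output above shows that SQLContext.getOrCreate(sc) hands back the most recently instantiated context (here a HiveContext), so it can be pattern-matched back down to HiveContext. The HiveContextUtil object is hypothetical and not part of the Spark API.]

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper, not part of Spark: recover a HiveContext from the
// context registered with SQLContext.getOrCreate, creating one only if the
// registered context is a plain SQLContext (or none exists yet).
object HiveContextUtil {
  def getOrCreate(sc: SparkContext): HiveContext =
    SQLContext.getOrCreate(sc) match {
      case hc: HiveContext => hc                 // reuse the existing HiveContext
      case _               => new HiveContext(sc) // otherwise create one
    }
}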
Re: Sharing HiveContext in Spark JobServer / getOrCreate
Have you noticed the following method of HiveContext?

  /**
   * Returns a new HiveContext as new session, which will have separated SQLConf, UDF/UDAF,
   * temporary tables and SessionState, but sharing the same CacheManager, IsolatedClientLoader
   * and Hive client (both of execution and metadata) with existing HiveContext.
   */
  override def newSession(): HiveContext = {

Cheers
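[Editor's note: for reference, a minimal sketch of the session-per-job pattern this method enables, assuming Spark 1.6 where HiveContext.newSession() is available.]

import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)  // created once per driver

// Each job gets its own session: SQLConf, UDFs and temporary tables are
// isolated, while the CacheManager and Hive clients are shared.
val session1 = hiveContext.newSession()
val session2 = hiveContext.newSession()

session1.range(10).registerTempTable("nums")            // visible only in session1
session1.setConf("spark.sql.shuffle.partitions", "10")  // per-session conf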
Re: Sharing HiveContext in Spark JobServer / getOrCreate
On 25 January 2016 at 21:09, Deenar Toraskar <deenar.toras...@thinkreactive.co.uk> wrote:

No, I hadn't. This is useful, but in some cases we do want to share the same temporary tables between jobs, so I really wanted a getOrCreate equivalent on HiveContext.

Deenar
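[Editor's note: one way to get that getOrCreate equivalent is a driver-wide holder, sketched below under the assumption that all jobs should share a single HiveContext and hence its temporary tables. The SharedHiveContext object is hypothetical, not part of Spark or Spark JobServer.]

import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper: lazily creates one HiveContext per driver and hands
// the same instance to every job/thread (double-checked locking).
object SharedHiveContext {
  @volatile private var instance: HiveContext = _

  def getOrCreate(sc: SparkContext): HiveContext = {
    if (instance == null) {
      synchronized {
        if (instance == null) {
          instance = new HiveContext(sc)
        }
      }
    }
    instance
  }
}

// Usage from any job running in the shared driver:
// val hc = SharedHiveContext.getOrCreate(sc)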