To use HiveContext, which is basically a SQL API within Spark, without a proper Hive setup does not make sense. It is a superset of Spark's SQLContext.
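For what it's worth, here is a minimal sketch (assuming Spark 1.6.x in local mode, and that the process can write a `metastore_db/` directory in its working directory) of creating a HiveContext without any hive-site.xml and exercising a window function:

```scala
// Assumes Spark 1.6.x. With no hive-site.xml on the classpath,
// HiveContext falls back to a local embedded Derby metastore
// (created as metastore_db/ in the current working directory).
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.max

val conf = new SparkConf().setAppName("hivectx-sketch").setMaster("local[*]")
val sc = new SparkContext(conf)
val sqlContext = new HiveContext(sc)
import sqlContext.implicits._

// Window functions (max over a partition) require the HiveContext in 1.x.
val df = Seq(("a", 1), ("a", 3), ("b", 2)).toDF("k", "v")
val w = Window.partitionBy("k")
df.withColumn("maxv", max($"v").over(w)).show()
```

If this fails with the SessionHiveMetaStoreClient error, check that the working directory is writable and that no stale Derby lock file is left behind from another process.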
In addition, simple things like registerTempTable may not work.

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 26 May 2016 at 13:01, Silvio Fiorito <silvio.fior...@granturing.com> wrote:

> Hi Gerard,
>
> I’ve never had an issue using the HiveContext without a hive-site.xml
> configured. However, one issue you may have is that if multiple users are
> starting the HiveContext from the same path, they’ll all be trying to store
> the default Derby metastore in the same location. Also, if you want them to
> be able to persist permanent table metadata for Spark SQL, then you’ll want
> to set up a true metastore.
>
> The other thing it could be is Hive dependency collisions on the
> classpath, but that shouldn’t be an issue since you said it’s standalone
> (not a Hadoop distro, right?).
>
> Thanks,
> Silvio
>
> *From: *Gerard Maas <gerard.m...@gmail.com>
> *Date: *Thursday, May 26, 2016 at 5:28 AM
> *To: *spark users <user@spark.apache.org>
> *Subject: *HiveContext standalone => without a Hive metastore
>
> Hi,
>
> I'm helping some folks set up an analytics cluster with Spark.
> They want to use the HiveContext to enable the window functions on
> DataFrames (*), but they don't have any Hive installation, nor do they
> need one at the moment (if it is not necessary for this feature).
>
> When we try to create a HiveContext, we get the following error:
>
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sparkContext)
>
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to
> instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>   at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>
> Is my HiveContext failing b/c it wants to connect to an unconfigured Hive
> metastore?
> Is there a way to instantiate a HiveContext for the sake of window
> support without an underlying Hive deployment?
>
> The docs are explicit in saying that this should be the case: [1]
>
> "To use a HiveContext, you do not need to have an existing Hive setup,
> and all of the data sources available to a SQLContext are still available.
> HiveContext is only packaged separately to avoid including all of Hive’s
> dependencies in the default Spark build."
>
> So what is the right way to address this issue? How do I instantiate a
> HiveContext with Spark running on an HDFS cluster without Hive deployed?
>
> Thanks a lot!
>
> -Gerard.
>
> (*) The need for a HiveContext to use window functions is pretty obscure.
> The only documentation of this seems to be a runtime exception:
> "org.apache.spark.sql.AnalysisException: Could not resolve window function
> 'max'. Note that, using window functions currently requires a HiveContext;"
>
> [1]
> http://spark.apache.org/docs/latest/sql-programming-guide.html#getting-started
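P.S. On the point about multiple users colliding on the default Derby metastore: one way to avoid that (an untested sketch; the paths below are illustrative, but the property names are standard Hive settings) is to give each user their own hive-site.xml on the classpath pointing Derby at a per-user directory:

```xml
<!-- hive-site.xml: per-user local Derby metastore (illustrative paths) -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/home/alice/metastore_db;create=true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/home/alice/hive-warehouse</value>
  </property>
</configuration>
```

That keeps each HiveContext's metadata isolated without standing up a full shared metastore service.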