I am using Spark (standalone) to run queries (from a remote client) against data in tables that are already defined/loaded in Hive.
I have started the metastore service in Hive successfully, and by placing hive-site.xml (with the proper metastore.uri) in the $SPARK_HOME/conf directory, I tried to share its configuration with Spark. When I start spark-shell, it gives me a default sqlContext, and I can use it to access my Hive tables without any problem. But once I submit a similar query from a Spark application via spark-submit, it does not see the tables; it seems it does not pick up the hive-site.xml under Spark's conf directory. I tried the --files argument of spark-submit to ship hive-site.xml to the workers, but that did not change anything.

Here is how I try to run the application:

```
$SPARK_HOME/bin/spark-submit --class "SimpleClient" \
  --master spark://my-spark-master:7077 \
  --files=$SPARK_HOME/conf/hive-site.xml \
  simple-sql-client-1.0.jar
```

Here is the simple example code (in Java) that I try to run:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SparkConf conf = new SparkConf().setAppName("Simple SQL Client");
JavaSparkContext sc = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(sc);
DataFrame res = sqlContext.sql("show tables");
res.show();
```

Software versions:

- Spark: 1.3
- Hive: 1.2
- Hadoop: 2.6

Thanks in advance for any suggestion.