Could you enable the HistoryServer and provide the properties and CLASSPATH for the spark-shell? And run the 'env' command to list your environment variables?
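To be concrete, here is a minimal sketch of enabling event logging plus the history server, assuming Spark 1.0.x (the HDFS directory below is just an example; use any path your user can write to):

    # conf/spark-defaults.conf
    spark.eventLog.enabled   true
    # example path -- any HDFS directory your user can write to
    spark.eventLog.dir       hdfs://192.168.40.164:8020/user/spark/eventlog

    # start the history server against the same directory
    $SPARK_HOME/sbin/start-history-server.sh hdfs://192.168.40.164:8020/user/spark/eventlog

    # dump the environment so we can see SPARK_CLASSPATH, HADOOP_CONF_DIR, etc.
    env | sort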
By the way, what do the Spark logs say? Enable debug mode to see what's going on in spark-shell when it tries to talk to the metastore and initialize the HiveContext.
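By "debug mode" I just mean turning up log4j; a minimal sketch (the template file ships with Spark):

    cd $SPARK_HOME
    cp conf/log4j.properties.template conf/log4j.properties
    # in conf/log4j.properties, change
    #   log4j.rootCategory=INFO, console
    # to
    #   log4j.rootCategory=DEBUG, console
    # or, to keep the noise down, raise only the Hive path:
    #   log4j.logger.org.apache.spark.sql.hive=DEBUG

Then rerun spark-shell and watch what happens while the HiveContext initializes.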
> On Jul 31, 2014, at 19:09, "chenjie" <chenjie2...@gmail.com> wrote:
>
> Hi, Yin and Andrew, thank you for your reply.
> When I create a table in the Hive CLI, it works correctly and the table can be found in HDFS. I forgot to start hiveserver2 before; I started it today.
> Then I ran the command below:
>
>   spark-shell --master spark://192.168.40.164:7077 --driver-class-path conf/hive-site.xml
>
> Furthermore, I added the following command:
>
>   hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://192.168.40.164:8020/user/hive/warehouse")
>
> But that didn't work for me. I got the same exception as before and found the table file in a local directory instead of HDFS.
>
>
> Yin Huai-2 wrote:
>> Another way is to set "hive.metastore.warehouse.dir" explicitly to the HDFS dir storing Hive tables by using the SET command. For example:
>>
>>   hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://localhost:54310/user/hive/warehouse")
>>
>>
>> On Thu, Jul 31, 2014 at 8:05 AM, Andrew Lee <alee526@...> wrote:
>>
>>> Hi All,
>>>
>>> It has been a while, but what I did to make it work was to make sure of the following:
>>>
>>> 1. Hive is working when you run the Hive CLI and JDBC via HiveServer2.
>>>
>>> 2. Make sure you have the hive-site.xml from the above Hive configuration. The problem here is that you want the hive-site.xml from the Hive metastore. The ones for Hive and HCatalog may be different files. Check the XML properties in those files and pick the one that has the warehouse property and the JDO setup configured.
>>>
>>> 3. Make sure the hive-site.xml from step 2 is included in $SPARK_HOME/conf, and in your runtime CLASSPATH when you run spark-shell.
>>>
>>> 4. Use the history server to check the runtime CLASSPATH and its ordering to ensure hive-site.xml is included.
>>>
>>> HiveContext should pick up the hive-site.xml and talk to your running Hive service.
>>>
>>> Hope these tips help.
>>>
>>>> On Jul 30, 2014, at 22:47, "chenjie" <chenjie2001@...> wrote:
>>>>
>>>> Hi, Michael. I have the same problem. My warehouse directory is always created locally. I copied the default hive-site.xml into the $SPARK_HOME/conf directory on each node. After I executed the code below,
>>>>
>>>>   val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>>>   hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>>   hiveContext.hql("LOAD DATA LOCAL INPATH '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE src")
>>>>   hiveContext.hql("FROM src SELECT key, value").collect()
>>>>
>>>> I got the exception below:
>>>>
>>>>   java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt does not exist
>>>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
>>>>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
>>>>     at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
>>>>     at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
>>>>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763)
>>>>     at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106)
>>>>     at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>>>>     at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
>>>>
>>>> At last, I found that /user/hive/warehouse/src/kv1.txt was created on the node where I started spark-shell.
>>>>
>>>> The Spark I used is the pre-built Spark 1.0.1 for Hadoop 2.
>>>>
>>>> Thanks in advance.
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11111.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
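For what it's worth, step 3 in Andrew's checklist usually comes down to pointing the classpath at the directory that holds hive-site.xml, not at the file itself: classpath entries are directories or jars, so --driver-class-path conf/hive-site.xml may be silently ignored. A sketch with example paths:

    # copy the metastore's hive-site.xml into Spark's conf dir
    # (the source path is an example -- use wherever your metastore config lives)
    cp /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/

    # put the conf directory -- not the xml file -- on the driver classpath
    spark-shell --master spark://192.168.40.164:7077 \
        --driver-class-path $SPARK_HOME/conf

If the warehouse path still comes out as file:/user/hive/warehouse, HiveContext is falling back to the default because it never saw your hive-site.xml.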