Hi Yin and Andrew, thank you for your replies.

When I create a table in the Hive CLI, it works correctly and the table shows up in HDFS. I had forgotten to start hiveserver2 before; I started it today and then ran the command below:

  spark-shell --master spark://192.168.40.164:7077 --driver-class-path conf/hive-site.xml

In addition, I ran the following command:

  hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://192.168.40.164:8020/user/hive/warehouse")

But it still didn't work for me. I got the same exception as before, and the table file was created in a local directory instead of in HDFS.
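For reference, here is roughly how I am checking what the HiveContext actually sees from spark-shell. This is only a sketch for the Spark 1.0.x HiveContext; the exact output of the SET queries may differ between versions, and the property names below are just the ones I care about:

  // Rough sketch: verify hive-site.xml is on the driver classpath and
  // see which warehouse/metastore settings are actually in effect.
  val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

  // Should print a URL ending in hive-site.xml; null means it was not picked up.
  println(getClass.getClassLoader.getResource("hive-site.xml"))

  // Ask Hive for the effective values; each result row should be a key=value pair.
  hiveContext.hql("SET hive.metastore.warehouse.dir").collect().foreach(println)
  hiveContext.hql("SET hive.metastore.uris").collect().foreach(println)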
Yin Huai-2 wrote
> Another way is to set "hive.metastore.warehouse.dir" explicitly to the HDFS
> dir storing Hive tables by using the SET command. For example:
>
> hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://localhost:54310/user/hive/warehouse")
>
> On Thu, Jul 31, 2014 at 8:05 AM, Andrew Lee <alee526@> wrote:
>
>> Hi All,
>>
>> It has been a while, but what I did to make it work is to make sure of the
>> following:
>>
>> 1. Hive is working when you run the Hive CLI and JDBC via Hiveserver2.
>>
>> 2. Make sure you have the hive-site.xml from the above Hive configuration. The
>> problem here is that you want the hive-site.xml from the Hive metastore.
>> The one for Hive and HCatalog may be different files. Make sure you check
>> the xml properties in that file; pick the one that has the warehouse
>> property configured and the JDO setup.
>>
>> 3. Make sure the hive-site.xml from step 2 is included in $SPARK_HOME/conf,
>> and in your runtime CLASSPATH when you run spark-shell.
>>
>> 4. Use the history server to check the runtime CLASSPATH and its order to
>> ensure hive-site.xml is included.
>>
>> HiveContext should pick up the hive-site.xml and talk to your running Hive
>> service.
>>
>> Hope these tips help.
>>
>> > On Jul 30, 2014, at 22:47, "chenjie" <chenjie2001@> wrote:
>> >
>> > Hi, Michael. I have the same problem. My warehouse directory is always
>> > created locally. I copied the default hive-site.xml into the
>> > $SPARK_HOME/conf directory on each node. After I executed the code below,
>> >   val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> >   hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>> >   hiveContext.hql("LOAD DATA LOCAL INPATH '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE src")
>> >   hiveContext.hql("FROM src SELECT key, value").collect()
>> >
>> > I got the exception below:
>> > java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt does not exist
>> >   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
>> >   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
>> >   at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
>> >   at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
>> >   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763)
>> >   at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106)
>> >   at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>> >   at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
>> >
>> > At last, I found /user/hive/warehouse/src/kv1.txt was created on the node
>> > where I start spark-shell.
>> >
>> > The Spark that I used is the pre-built Spark 1.0.1 for Hadoop 2.
>> >
>> > Thanks in advance.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11111.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.