Another way is to set "hive.metastore.warehouse.dir" explicitly to the HDFS directory that stores the Hive tables, using the SET command. For example:
hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://localhost:54310/user/hive/warehouse") On Thu, Jul 31, 2014 at 8:05 AM, Andrew Lee <alee...@hotmail.com> wrote: > Hi All, > > It has been awhile, but what I did to make it work is to make sure the > followings: > > 1. Hive is working when you run Hive CLI and JDBC via Hiveserver2 > > 2. Make sure you have the hive-site.xml from above Hive configuration. The > problem here is that you want the hive-site.xml from the Hive metastore. > The one for Hive and HCatalog may be different files. Make sure you check > the xml properties in that file, pick the one that has the warehouse > property configured and the JDO setup. > > 3. Make sure hive-site.xml from step 2 is included in $SPARK_HOME/conf, > and in your runtime CLASSPATH when you run spark-shell > > 4. Use the history server to check the runtime CLASSPATH and order to > ensure hive-site.xml is included. > > HiveContext should pick up the hive-site.xml and talk to your running hive > service. > > Hope these tips help. > > > On Jul 30, 2014, at 22:47, "chenjie" <chenjie2...@gmail.com> wrote: > > > > Hi, Michael. I Have the same problem. My warehouse directory is always > > created locally. I copied the default hive-site.xml into the > > $SPARK_HOME/conf directory on each node. After I executed the code below, > > val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc) > > hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value > > STRING)") > > hiveContext.hql("LOAD DATA LOCAL INPATH > > '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE > src") > > hiveContext.hql("FROM src SELECT key, value").collect() > > > > I got the exception below: > > java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt > > does not exist > > at > > > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520) > > at > > > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398) > > at > > > org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137) > > at > > org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339) > > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763) > > at > > > org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106) > > at > > > org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) > > at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193) > > > > At last, I found /user/hive/warehouse/src/kv1.txt was created on the node > > where I start spark-shell. > > > > The spark that I used is pre-built spark1.0.1 for hadoop2. > > > > Thanks in advance. > > > > > > Michael Armbrust wrote > >> The warehouse and the metastore directories are two different things. > The > >> metastore holds the schema information about the tables and will by > >> default > >> be a local directory. With javax.jdo.option.ConnectionURL you can > >> configure it to be something like mysql. The warehouse directory is the > >> default location where the actual contents of the tables is stored. > What > >> directory are seeing created locally? > > > > > > > > > > > > -- > > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11024.html > > Sent from the Apache Spark User List mailing list archive at Nabble.com. >