Another way is to set "hive.metastore.warehouse.dir" explicitly to the HDFS directory that stores the Hive tables, using the SET command. For example:
hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://localhost:54310/user/hive/warehouse") On Thu, Jul 31, 2014 at 8:05 AM, Andrew Lee <alee...@hotmail.com> wrote: > Hi All, > > It has been awhile, but what I did to make it work is to make sure the > followings: > > 1. Hive is working when you run Hive CLI and JDBC via Hiveserver2 > > 2. Make sure you have the hive-site.xml from above Hive configuration. The > problem here is that you want the hive-site.xml from the Hive metastore. > The one for Hive and HCatalog may be different files. Make sure you check > the xml properties in that file, pick the one that has the warehouse > property configured and the JDO setup. > > 3. Make sure hive-site.xml from step 2 is included in $SPARK_HOME/conf, > and in your runtime CLASSPATH when you run spark-shell > > 4. Use the history server to check the runtime CLASSPATH and order to > ensure hive-site.xml is included. > > HiveContext should pick up the hive-site.xml and talk to your running hive > service. > > Hope these tips help. > > > On Jul 30, 2014, at 22:47, "chenjie" <chenjie2...@gmail.com> wrote: > > > > Hi, Michael. I Have the same problem. My warehouse directory is always > > created locally. I copied the default hive-site.xml into the > > $SPARK_HOME/conf directory on each node. After I executed the code below, > > val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc) > > hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value > > STRING)") > > hiveContext.hql("LOAD DATA LOCAL INPATH > > '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE > src") > > hiveContext.hql("FROM src SELECT key, value").collect() > > > > I got the exception below: > > java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt > > does not exist > > at > > > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520) > > at > > > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398) > > at > > > org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137) > > at > > org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339) > > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763) > > at > > > org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106) > > at > > > org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) > > at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193) > > > > At last, I found /user/hive/warehouse/src/kv1.txt was created on the node > > where I start spark-shell. > > > > The spark that I used is pre-built spark1.0.1 for hadoop2. > > > > Thanks in advance. > > > > > > Michael Armbrust wrote > >> The warehouse and the metastore directories are two different things. > The > >> metastore holds the schema information about the tables and will by > >> default > >> be a local directory. With javax.jdo.option.ConnectionURL you can > >> configure it to be something like mysql. The warehouse directory is the > >> default location where the actual contents of the tables is stored. > What > >> directory are seeing created locally? > > > > > > > > > > > > -- > > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11024.html > > Sent from the Apache Spark User List mailing list archive at Nabble.com. >