Hi Yin and Andrew, thank you for your replies.

When I create a table in the Hive CLI, it works correctly and the table shows up in HDFS. I had forgotten to start hiveserver2 before; I started it today and then ran the command below:

  spark-shell --master spark://192.168.40.164:7077 --driver-class-path conf/hive-site.xml

In addition, I ran the following command:

  hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://192.168.40.164:8020/user/hive/warehouse")

But it still didn't work for me. I got the same exception as before, and the table file was created in a local directory instead of in HDFS.
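For reference, here is roughly how I am checking what the HiveContext actually sees from spark-shell. This is only a sketch for the Spark 1.0.x HiveContext; the exact output of the SET queries may differ between versions, and the property names below are just the ones I care about:

  // Rough sketch: verify hive-site.xml is on the driver classpath and
  // see which warehouse/metastore settings are actually in effect.
  val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

  // Should print a URL ending in hive-site.xml; null means it was not picked up.
  println(getClass.getClassLoader.getResource("hive-site.xml"))

  // Ask Hive for the effective values; each result row should be a key=value pair.
  hiveContext.hql("SET hive.metastore.warehouse.dir").collect().foreach(println)
  hiveContext.hql("SET hive.metastore.uris").collect().foreach(println)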
Yin Huai-2 wrote
> Another way is to set "hive.metastore.warehouse.dir" explicitly to the HDFS
> dir storing Hive tables by using the SET command. For example:
>
> hiveContext.hql("SET hive.metastore.warehouse.dir=hdfs://localhost:54310/user/hive/warehouse")
>
> On Thu, Jul 31, 2014 at 8:05 AM, Andrew Lee <alee526@> wrote:
>
>> Hi All,
>>
>> It has been a while, but what I did to make it work is to make sure of the
>> following:
>>
>> 1. Hive is working when you run the Hive CLI and JDBC via Hiveserver2.
>>
>> 2. Make sure you have the hive-site.xml from the above Hive configuration. The
>> problem here is that you want the hive-site.xml from the Hive metastore.
>> The one for Hive and HCatalog may be different files. Make sure you check
>> the xml properties in that file; pick the one that has the warehouse
>> property configured and the JDO setup.
>>
>> 3. Make sure the hive-site.xml from step 2 is included in $SPARK_HOME/conf,
>> and in your runtime CLASSPATH when you run spark-shell.
>>
>> 4. Use the history server to check the runtime CLASSPATH and its order to
>> ensure hive-site.xml is included.
>>
>> HiveContext should pick up the hive-site.xml and talk to your running Hive
>> service.
>>
>> Hope these tips help.
>>
>> > On Jul 30, 2014, at 22:47, "chenjie" <chenjie2001@> wrote:
>> >
>> > Hi, Michael. I have the same problem. My warehouse directory is always
>> > created locally. I copied the default hive-site.xml into the
>> > $SPARK_HOME/conf directory on each node. After I executed the code below,
>> >   val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> >   hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>> >   hiveContext.hql("LOAD DATA LOCAL INPATH '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE src")
>> >   hiveContext.hql("FROM src SELECT key, value").collect()
>> >
>> > I got the exception below:
>> > java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt does not exist
>> >   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
>> >   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
>> >   at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
>> >   at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
>> >   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763)
>> >   at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106)
>> >   at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>> >   at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
>> >
>> > At last, I found /user/hive/warehouse/src/kv1.txt was created on the node
>> > where I start spark-shell.
>> >
>> > The Spark that I used is the pre-built Spark 1.0.1 for Hadoop 2.
>> >
>> > Thanks in advance.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11111.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.