The warehouse location needs to be specified before the HiveContext is initialized; you can set it when launching the CLI via:

./bin/spark-sql --hiveconf hive.metastore.warehouse.dir=/home/spark/hive/warehouse
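
For reference, a rough Scala sketch of the equivalent when driving Spark SQL programmatically (assuming Spark 1.1; the app name below is just a placeholder) is to set the property on the HiveContext before issuing any DDL:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("warehouse-dir-example"))
val hiveContext = new HiveContext(sc)

// Set the warehouse location before the first table is created,
// so the metastore picks it up when resolving new table paths.
hiveContext.setConf("hive.metastore.warehouse.dir", "/home/spark/hive/warehouse")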

On 10/15/14 8:55 PM, Hao Ren wrote:

Hi,

The following query in the Spark SQL 1.1.0 CLI doesn't work.

SET hive.metastore.warehouse.dir=/home/spark/hive/warehouse;

create table test as
select v1.*, v2.card_type, v2.card_upgrade_time_black,
v2.card_upgrade_time_gold
from customer v1 left join customer_loyalty v2
on v1.account_id = v2.account_id
limit 5;

Stack trace =>

org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:file:/user/hive/warehouse/test is not a directory or unable to create one)
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:559)
        at org.apache.spark.sql.hive.HiveMetastoreCatalog.createTable(HiveMetastoreCatalog.scala:99)
        at org.apache.spark.sql.hive.HiveMetastoreCatalog$CreateTables$anonfun$apply$1.applyOrElse(HiveMetastoreCatalog.scala:116)
        at org.apache.spark.sql.hive.HiveMetastoreCatalog$CreateTables$anonfun$apply$1.applyOrElse(HiveMetastoreCatalog.scala:111)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:156)
        at org.apache.spark.sql.hive.HiveMetastoreCatalog$CreateTables$.apply(HiveMetastoreCatalog.scala:111)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.optimizedPlan$lzycompute(HiveContext.scala:358)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.optimizedPlan(HiveContext.scala:357)
        at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:402)
        at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:400)
        at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:406)
        at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:406)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:360)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:360)
        at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
        at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:103)
        at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:98)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:58)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:291)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: MetaException(message:file:/user/hive/warehouse/test is not a directory or unable to create one)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1060)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
        at sun.reflect.GeneratedMethodAccessor133.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
        at com.sun.proxy.$Proxy15.create_table_with_environment_context(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
        at sun.reflect.GeneratedMethodAccessor132.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
        at com.sun.proxy.$Proxy16.createTable(Unknown Source)
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
        ... 30 more

It seems that the CLI doesn't take the hive.metastore.warehouse.dir value when
creating a table with "as select ...".
If I just create the table, like "create table t (...)", and then load data
into it, it works fine.
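
For illustration, that two-step version looks roughly like the following from Scala (a sketch only; the schema and input path are made-up placeholders, not my real tables):

// Plain "create table" honors the configured warehouse directory;
// the data is then loaded in a separate statement.
hiveContext.sql("CREATE TABLE t (account_id STRING, card_type STRING)")
hiveContext.sql("LOAD DATA INPATH '/tmp/t_input' INTO TABLE t")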
It also works in Scala code, I mean:

hiveContext sql "create table t as select ..."
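
Spelled out with the query from above, that is:

hiveContext.sql(
  """create table test as
    |select v1.*, v2.card_type, v2.card_upgrade_time_black,
    |v2.card_upgrade_time_gold
    |from customer v1 left join customer_loyalty v2
    |on v1.account_id = v2.account_id
    |limit 5""".stripMargin)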

Is this an intentional feature?

Hao


