Hi all,
I've read the docs for Spark SQL 2.1.0 but I'm still having issues with the
warehouse and related details.
I'm not using Hive proper, so my hive-site.xml consists only of:
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/mnt/data/spark/metastore_db;create=true</value>
I've set "spark.sql.warehouse.dir" in my "spark-defaults.conf", but the
location in my catalog doesn't match:
scala> spark.conf.get("spark.sql.warehouse.dir")
res8: String = file://mnt/data/spark/warehouse
scala> spark.conf.get("hive.metastore.warehouse.dir")
res9: String = file://mnt/data/spark/warehouse
scala> spark.catalog.listDatabases.show(false)
+-------+---------------------+-----------------------------+
|name   |description          |locationUri                  |
+-------+---------------------+-----------------------------+
|default|Default Hive database|file:/home/me/spark-warehouse|
+-------+---------------------+-----------------------------+
I've also tried setting "spark.sql.warehouse.dir" to a valid HDFS path to no
avail.
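As a side check on the configured value, here is a minimal sketch of how
java.net.URI splits the three location strings shown above. The helper
name hostAndPath is hypothetical (it is not part of Spark), and I have not
verified that this has anything to do with the catalog mismatch; it only
shows that "file://mnt/..." is parsed with "mnt" as the host rather than as
the first path component.

```scala
import java.net.URI

// Hypothetical helper (not a Spark API): return the host and path
// components that java.net.URI extracts from a location string.
def hostAndPath(location: String): (Option[String], String) = {
  val uri = new URI(location)
  // getHost returns null when the URI has no authority component
  (Option(uri.getHost), uri.getPath)
}

// Value reported by spark.conf.get("spark.sql.warehouse.dir")
println(hostAndPath("file://mnt/data/spark/warehouse"))
// Same directory written with an empty authority (three slashes)
println(hostAndPath("file:///mnt/data/spark/warehouse"))
// Location the catalog reports for the default database
println(hostAndPath("file:/home/me/spark-warehouse"))
```

This can be pasted into spark-shell or a plain Scala REPL, since it only
uses the JDK.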
My application loads both ORC tables and Avro files (using spark-avro) from
HDFS.
When I load a table using spark.sql("select * from orc.`my-table-in-hdfs`"), I
see "WARN ObjectStore: Failed to get database orc, returning NoSuchObjectException".
When I load an Avro file from HDFS using spark.read.avro(filename), I see
"WARN DataSource: Error while looking for metadata directory".
Any ideas as to what I'm doing wrong?
--
Joe Naegele
Grier Forensics
410.220.0968