We can set a path; refer to the unit tests
(<https://github.com/apache/spark/blob/master/python/pyspark/sql/tests.py>).
For example:

    df.saveAsTable("savedJsonTable", "org.apache.spark.sql.json", "append", path=tmpPath)
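For context, the `org.apache.spark.sql.json` data source writes one JSON object per line into `part-*` files under the target path, which is exactly the layout Tom's `cat /tmp/test10/part-00000` dump shows further down. A minimal stdlib-only sketch of that on-disk format (no Spark involved; the `part-00000` file name just mimics Spark's naming convention):

```python
import json
import os
import tempfile

# Rows shaped like the ones Tom builds with Row(key=k, value=str(k)).
rows = [{"key": k, "value": str(k)} for k in range(6)]

# The JSON data source writes one JSON object per line into part files
# under the target path; we mimic that with the stdlib only.
path = tempfile.mkdtemp()
with open(os.path.join(path, "part-00000"), "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Reading the output back is just line-by-line JSON parsing.
with open(os.path.join(path, "part-00000")) as f:
    loaded = [json.loads(line) for line in f]
```

This is why the data is readable with `cat`: each line is a self-contained JSON document.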
Investigating some more, I found that the data is being written at the
specified location, but the error is still being thrown and the table has
not been registered in the metastore. This is the code that I ran:

>>> a = [Row(key=k, value=str(k)) for k in range(100)]
>>> df = sc.parallelize(a).toDF()
>>> df.saveAsTable("savedJsonTable", "org.apache.spark.sql.json", "append", path="/tmp/test10")
15/03/27 10:45:13 ERROR RetryingHMSHandler: MetaException(message:file:/user/hive/warehouse/savedjsontable is not a directory or unable to create one)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1239)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1294)
        ...
>>> sqlCtx.tables()
DataFrame[tableName: string, isTemporary: boolean]
>>> exit()

~> cat /tmp/test10/part-00000
{"key":0,"value":"0"}
{"key":1,"value":"1"}
{"key":2,"value":"2"}
{"key":3,"value":"3"}
{"key":4,"value":"4"}
{"key":5,"value":"5"}

Kind Regards,
Tom

On 27 March 2015 at 10:33, Yanbo Liang <yblia...@gmail.com> wrote:

> "saveAsTable" will use the default data source configured by
> spark.sql.sources.default.
>
>     def saveAsTable(tableName: String): Unit = {
>       saveAsTable(tableName, SaveMode.ErrorIfExists)
>     }
>
> It cannot set "path", if I understand correctly.
>
> 2015-03-27 15:45 GMT+08:00 Tom Walwyn <twal...@gmail.com>:
>
>> Hi,
>>
>> The behaviour is the same for me in Scala and Python, so I'm posting the
>> Python here. When I use DataFrame.saveAsTable with the path option, I
>> expect an external Hive table to be created at the specified path.
>> Specifically, when I call:
>>
>> >>> df.saveAsTable(..., path="/tmp/test")
>>
>> I expect an external Hive table to be created pointing to /tmp/test,
>> which would contain the data in df.
>>
>> However, running locally on my Mac, I get an error indicating that Spark
>> tried to create a managed table in the location of the Hive warehouse:
>>
>> ERROR RetryingHMSHandler:
>> MetaException(message:file:/user/hive/warehouse/savetable is not a
>> directory or unable to create one)
>>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1239)
>>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1294)
>>
>> Am I wrong to expect Spark to create an external table in this case?
>> What is the expected behaviour of saveAsTable with the path option?
>>
>> Setup: running Spark 1.3.0 locally.
>>
>> Kind Regards,
>> Tom
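To make the expectation in this thread concrete: the contract Tom is asking about is that `path` should make the data land at that location and the metastore entry should record it as an external table there, whereas omitting `path` should produce a managed table under the warehouse directory. Below is a hypothetical stand-in in plain Python (no Spark; `save_as_table`, `metastore`, and `warehouse_dir` are all invented names, not Spark's API) sketching that contract:

```python
import json
import os
import tempfile

# Invented stand-ins: a dict plays the metastore, a temp dir plays
# /user/hive/warehouse. This is NOT Spark's implementation, only an
# illustration of the external-vs-managed table contract.
metastore = {}  # table name -> {"location": ..., "external": ...}
warehouse_dir = tempfile.mkdtemp()

def save_as_table(rows, table_name, path=None):
    # With a path: external table, data at the caller-supplied location.
    # Without: managed table, data under the warehouse directory.
    external = path is not None
    location = path if external else os.path.join(warehouse_dir, table_name.lower())
    os.makedirs(location, exist_ok=True)
    # Write the rows as JSON lines, like the JSON data source does.
    with open(os.path.join(location, "part-00000"), "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    metastore[table_name] = {"location": location, "external": external}

rows = [{"key": k, "value": str(k)} for k in range(100)]

# Expected behaviour with a path: external table pointing at that path.
ext_path = tempfile.mkdtemp()
save_as_table(rows, "savedJsonTable", path=ext_path)

# Expected behaviour without a path: managed table under the warehouse.
save_as_table(rows, "managedTable")
```

In the error Tom reports, Spark 1.3 did the first half (data files at the path) but attempted the managed-table location for the metastore entry, which is what the `/user/hive/warehouse/...` message reflects.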