Re: Spark2 DataFrameWriter.saveAsTable defaults to external table if path is provided

2019-02-13 Thread Chris Teoh
Thanks Peter. I'm not sure if that is possible yet. The closest I can think of to achieving what you want is to try something like:

    df.registerTempTable("mytable")
    sql("create table mymanagedtable as select * from mytable")

I haven't used CTAS in Spark SQL before but have heard it works. This
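For anyone reading along, the CTAS idea in this reply could be sketched in PySpark roughly as follows. This is an untested sketch under assumptions: `spark` is a SparkSession (Hive support is probably needed for a Hive table) and `df` is a DataFrame created from it. The helper names `ctas_statement` and `ctas_from_dataframe` are made up for illustration. `createOrReplaceTempView` is used because `registerTempTable` has been deprecated since Spark 2.0. Whether the resulting table ends up managed at a custom location is not something this sketch settles.

```python
def ctas_statement(target_table, source_view):
    """Build a CREATE TABLE ... AS SELECT statement (table names are illustrative)."""
    return "CREATE TABLE {} AS SELECT * FROM {}".format(target_table, source_view)


def ctas_from_dataframe(spark, df, target_table, view_name="mytable"):
    """Expose df as a temporary view, then materialize it with CTAS.

    Assumes `spark` is a SparkSession and `df` is one of its DataFrames.
    Returns the SQL text that was submitted.
    """
    # Replacement for the deprecated df.registerTempTable(view_name).
    df.createOrReplaceTempView(view_name)
    statement = ctas_statement(target_table, view_name)
    spark.sql(statement)
    return statement
```

Usage would be something like `ctas_from_dataframe(spark, df, "mymanagedtable")`, after which `DESCRIBE FORMATTED mymanagedtable` should show whether the table type is MANAGED or EXTERNAL.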

Re: Spark2 DataFrameWriter.saveAsTable defaults to external table if path is provided

2019-02-13 Thread Horváth Péter Gergely
Hi Chris, Thank you for the input. I know I can always write the table DDL manually, but here I would like to rely on Spark generating the schema. What I don't understand is the change in the behaviour of Spark: having the storage path specified does not necessarily mean it should be an external

Re: Spark2 DataFrameWriter.saveAsTable defaults to external table if path is provided

2019-02-13 Thread Chris Teoh
Hey there, Could you not just create a managed table using the DDL in Spark SQL and then write the data frame to the underlying folder, or use Spark SQL to do an insert? Alternatively, try create table as select. IIRC, Hive creates managed tables this way. I've not confirmed this works but I
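A rough sketch of the "create the table with DDL, then insert" route, again in PySpark and untested against a real cluster. Assumptions: `spark` is a SparkSession with Hive support, `df` is a DataFrame, and the helper names (`create_table_ddl`, `columns_from_schema`, `create_then_insert`) are hypothetical. The column list is derived from `df.schema` so the DDL does not have to be typed by hand. No LOCATION clause is included, so this only covers a managed table at the default warehouse location, not the path override asked about in the thread.

```python
def create_table_ddl(table_name, columns):
    """Build a CREATE TABLE statement from (column_name, sql_type) pairs."""
    column_list = ", ".join("`{}` {}".format(name, sql_type) for name, sql_type in columns)
    return "CREATE TABLE IF NOT EXISTS {} ({})".format(table_name, column_list)


def columns_from_schema(schema):
    """Turn a Spark StructType into (column_name, sql_type) pairs."""
    return [(field.name, field.dataType.simpleString()) for field in schema.fields]


def create_then_insert(spark, df, table_name):
    """Create the table from df's schema, then append df's rows to it."""
    ddl = create_table_ddl(table_name, columns_from_schema(df.schema))
    spark.sql(ddl)
    # insertInto matches columns by position, not by name.
    df.write.insertInto(table_name)
    return ddl
```

Because `insertInto` is positional, the table's column order has to match the DataFrame's, which is why the DDL here is generated from the same schema.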

Spark2 DataFrameWriter.saveAsTable defaults to external table if path is provided

2019-02-13 Thread Horváth Péter Gergely
Dear All, I am facing a strange issue with Spark 2.3, where I would like to create a MANAGED table out of the content of a DataFrame with the storage path overridden. Apparently, when one tries to create a Hive table via DataFrameWriter.saveAsTable, supplying a "path" option causes Spark to
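To make the report easier to reproduce, here is a small PySpark sketch of the call pattern described, plus a way to check the resulting table type through the catalog. It assumes a SparkSession `spark` with Hive support and a DataFrame `df`. The helper names `save_with_path` and `table_type`, and any table name or path passed to them, are illustrative and not taken from the original message.

```python
def save_with_path(df, table_name, path):
    """Write df as a table while overriding the storage path."""
    df.write.option("path", path).saveAsTable(table_name)


def table_type(spark, table_name, database="default"):
    """Return the catalog's table type (e.g. "MANAGED" or "EXTERNAL"), or None if not found."""
    for table in spark.catalog.listTables(database):
        # Compare case-insensitively, since the catalog may store names in lower case.
        if table.name.lower() == table_name.lower():
            return table.tableType
    return None
```

Calling `save_with_path(df, "mytable", some_path)` and then `table_type(spark, "mytable")` would show which table type Spark recorded for the given path.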