Karuppayya created SPARK-29324:
----------------------------------

             Summary: saveAsTable with overwrite mode results in metadata loss
                 Key: SPARK-29324
                 URL: https://issues.apache.org/jira/browse/SPARK-29324
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Karuppayya

{code:java}
scala> spark.range(1).write.option("path", "file:///tmp/tbl").format("orc").saveAsTable("tbl")

scala> spark.sql("desc extended tbl").collect.foreach(println)
[id,bigint,null]
[,,]
[# Detailed Table Information,,]
[Database,default,]
[Table,tbl,]
[Owner,karuppayyar,]
[Created Time,Wed Oct 02 09:29:06 IST 2019,]
[Last Access,UNKNOWN,]
[Created By,Spark 3.0.0-SNAPSHOT,]
[Type,EXTERNAL,]
[Provider,orc,]
[Location,file:/tmp/tbl_loc,]
[Serde Library,org.apache.hadoop.hive.ql.io.orc.OrcSerde,]
[InputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat,]
[OutputFormat,org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat,]
{code}

{code:java}
// Overwriting table
scala> spark.range(100).write.mode("overwrite").saveAsTable("tbl")

scala> spark.sql("desc extended tbl").collect.foreach(println)
[id,bigint,null]
[,,]
[# Detailed Table Information,,]
[Database,default,]
[Table,tbl,]
[Owner,karuppayyar,]
[Created Time,Wed Oct 02 09:30:36 IST 2019,]
[Last Access,UNKNOWN,]
[Created By,Spark 3.0.0-SNAPSHOT,]
[Type,MANAGED,]
[Provider,parquet,]
[Location,file:/tmp/tbl,]
[Serde Library,org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe,]
[InputFormat,org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,]
[OutputFormat,org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat,]
{code}

The first code block creates an EXTERNAL table in ORC format.
The second code block overwrites it with more data, without restating the format or path.

After the overwrite:
1. The external table has become a managed table.
2. The file format has changed from ORC to Parquet (the default file format).

Other information (such as the owner and TBLPROPERTIES) is also overwritten.
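
Until the behaviour is changed, one possible way to avoid losing the table type and file format is to restate them on the overwriting write, since the overwrite appears to rebuild the table from whatever the writer specifies. The sketch below is for spark-shell and assumes the `spark` session from the report. `overwriteKeepingFormatAndPath` is a hypothetical helper, not part of Spark, and it does not preserve the owner or TBLPROPERTIES.

{code:java}
import org.apache.spark.sql.DataFrame

// Hypothetical helper (not a Spark API): overwrite a table while passing the
// provider and location explicitly, so the recreated table does not fall back
// to the default format (parquet) or to a managed location.
def overwriteKeepingFormatAndPath(
    df: DataFrame,
    table: String,
    format: String,
    path: String): Unit = {
  df.write
    .mode("overwrite")
    .format(format)        // restate the provider, e.g. "orc"
    .option("path", path)  // restate the location so the table stays EXTERNAL
    .saveAsTable(table)
}

// Usage with the values from the report.
overwriteKeepingFormatAndPath(spark.range(100).toDF(), "tbl", "orc", "file:///tmp/tbl")

// Inspect the table type through the public catalog API.
println(spark.catalog.getTable("tbl").tableType)
{code}

The same check can be made with `desc extended tbl` as in the report; the Type and Provider rows are the ones to compare before and after the overwrite.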