[ https://issues.apache.org/jira/browse/SPARK-16410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-16410.
-------------------------------
    Resolution: Duplicate

> DataFrameWriter's jdbc method drops table in overwrite mode
> -----------------------------------------------------------
>
>                 Key: SPARK-16410
>                 URL: https://issues.apache.org/jira/browse/SPARK-16410
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1, 1.6.2
>            Reporter: Ian Hellstrom
>
> According to the [API documentation|http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameWriter], the write mode {{overwrite}} should _overwrite the existing data_, which suggests that the data is removed, i.e. the table is truncated.
> However, that is not what happens in the [source code|https://github.com/apache/spark/blob/0ad6ce7e54b1d8f5946dde652fa5341d15059158/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L421]:
> {code}
> if (mode == SaveMode.Overwrite && tableExists) {
>   JdbcUtils.dropTable(conn, table)
>   tableExists = false
> }
> {code}
> This clearly shows that the table is first dropped and then recreated. This causes two major issues:
> * Existing indexes, partitioning schemes, etc. are completely lost.
> * The case of identifiers may be changed without the user understanding why.
> In my opinion, the table should be truncated, not dropped. Overwriting data is a DML operation and should not cause DDL.
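The truncate-instead-of-drop behavior the reporter proposes can be sketched as a small decision function. This is a hypothetical illustration, not Spark's actual API: the `SaveMode` objects and the `preWriteStatement` helper below are names invented for the sketch, standing in for the branch in `DataFrameWriter` quoted above.

```scala
// Sketch of the proposed pre-write decision: in Overwrite mode against an
// existing table, issue TRUNCATE TABLE (DML-adjacent, keeps indexes,
// partitioning, and identifier case) instead of DROP TABLE (DDL).
// All names here are illustrative, not Spark internals.
object OverwriteSql {
  sealed trait SaveMode
  case object Overwrite extends SaveMode
  case object Append extends SaveMode

  // Returns the statement to run before writing rows, if any.
  def preWriteStatement(mode: SaveMode, table: String, tableExists: Boolean): Option[String] =
    (mode, tableExists) match {
      // Proposed behavior; the current code effectively does s"DROP TABLE $table" here.
      case (Overwrite, true) => Some(s"TRUNCATE TABLE $table")
      case _                 => None
    }
}
```

Under this scheme an overwrite of an existing table empties it in place, so any secondary indexes and the table's original definition survive the write.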