[ https://issues.apache.org/jira/browse/SPARK-41094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengfei zhao updated SPARK-41094:
---------------------------------
    Affects Version/s:     (was: 3.3.1)

> The saveAsTable method fails to be executed, resulting in data file loss
> ------------------------------------------------------------------------
>
>                 Key: SPARK-41094
>                 URL: https://issues.apache.org/jira/browse/SPARK-41094
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>            Reporter: pengfei zhao
>            Priority: Major
>
> We hit a problem in our production environment.
> The code is: df.write.mode(SaveMode.Overwrite).saveAsTable("xxx").
> While saveAsTable was executing, an executor exited due to an OOM, leaving
> only part of the data files written to HDFS, and the subsequent Spark
> retries failed as well.
> This is very similar to the scenario described in SPARK-22504, but it
> actually happened to us.
> I read the source code. Why does Spark need to drop the table first and
> then execute the plan? What happens if the execution fails after the table
> has been dropped?
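> To make the risk concrete: with SaveMode.Overwrite on an existing table,
> the current path behaves roughly like the sequence below (a simplification
> of the actual internals, not the real code; the table name is taken from
> the example above):
>
>     spark.sql("DROP TABLE IF EXISTS xxx")  // old table and its data are gone from here on
>     df.write.saveAsTable("xxx")            // a failure here leaves nothing behind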
> I know the community's position, but dropping the table first is too
> risky. Could we instead adopt the following procedure, as Hive does (a
> sketch follows the list)?
> 1. WRITE: create a temporary table and write the data to it
> 2. SWAP: swap the temporary table with the target table using a rename
> operation
> 3. CLEAN: clean up the old data
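> A minimal sketch of this flow, assuming a Hive-backed catalog where
> ALTER TABLE ... RENAME TO is a cheap metastore operation; the table and
> helper names here are illustrative only:
>
>     import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
>
>     def overwriteViaSwap(spark: SparkSession, df: DataFrame, target: String): Unit = {
>       val temp = s"${target}_swap_tmp"   // hypothetical staging table name
>       val old  = s"${target}_swap_old"   // hypothetical name for the retired copy
>
>       // WRITE: materialize the new data under a temporary name first,
>       // so a failed write leaves the target table untouched.
>       df.write.mode(SaveMode.Overwrite).saveAsTable(temp)
>
>       // SWAP: move the new table into place with metastore renames.
>       if (spark.catalog.tableExists(target)) {
>         spark.sql(s"ALTER TABLE $target RENAME TO $old")
>       }
>       spark.sql(s"ALTER TABLE $temp RENAME TO $target")
>
>       // CLEAN: drop the previous copy only after the swap succeeded.
>       spark.sql(s"DROP TABLE IF EXISTS $old")
>     }
>
> Note that the two renames are not atomic, so there is still a brief window
> with no target table, but a failure during WRITE no longer destroys the
> existing data.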


