[ 
https://issues.apache.org/jira/browse/SPARK-48292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866966#comment-17866966
 ] 

Steve Loughran commented on SPARK-48292:
----------------------------------------

What happens if a task attempt (TA) is authorized to commit but never reports 
back? A network partition can trigger this. The output may appear consistent 
with the committed task after a second task attempt is told to commit, but the 
partitioned TA may still commit later. The core MapReduce commit protocols say 
"exactly one of the TAs shall have its output committed", but they don't 
guarantee that it is the second one.
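
A minimal sketch of that race, assuming a toy coordinator that hands the 
commit permission to a new attempt once the first one is reported lost, and a 
toy last-writer-wins store. ToyCoordinator, commit and simulate are 
hypothetical names for illustration only; this is not Spark's 
OutputCommitCoordinator or any real committer.

```python
class ToyCoordinator:
    """Hypothetical driver-side coordinator: one authorized attempt per partition."""

    def __init__(self):
        self.authorized = {}  # partition -> attempt currently allowed to commit

    def can_commit(self, partition, attempt):
        # The first attempt to ask wins; any other attempt is refused.
        holder = self.authorized.setdefault(partition, attempt)
        return holder == attempt

    def attempt_failed(self, partition, attempt):
        # The driver only sees a lost task; it cannot tell whether the commit ran.
        # Assumption: the toy coordinator then frees the permission for a retry.
        if self.authorized.get(partition) == attempt:
            del self.authorized[partition]


def commit(store, partition, attempt):
    # Toy commit: whoever writes last owns the destination.
    store[partition] = f"data-from-attempt-{attempt}"


def simulate():
    coordinator, store = ToyCoordinator(), {}
    first_ok = coordinator.can_commit(0, attempt=0)   # TA 0 is authorized
    coordinator.attempt_failed(0, attempt=0)          # partition: TA 0 never reports back
    second_ok = coordinator.can_commit(0, attempt=1)  # TA 1 is authorized instead
    commit(store, 0, attempt=1)                       # TA 1 commits and reports success
    commit(store, 0, attempt=0)                       # TA 0 resurfaces and commits late
    return first_ok, second_ok, coordinator.authorized[0], store[0]
```

In this sketch the driver ends up believing attempt 1 owns the output while 
the store holds the data written by attempt 0, which is the mismatch described 
above.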

> Revert [SPARK-39195][SQL] Spark OutputCommitCoordinator should abort stage 
> when committed file not consistent with task status
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-48292
>                 URL: https://issues.apache.org/jira/browse/SPARK-48292
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: L. C. Hsieh
>            Assignee: angerszhu
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.5.2, 3.4.4
>
>
> When a task attempt fails after it has been authorized to commit, 
> OutputCommitCoordinator fails the stage with a reason message 
> saying that the task commit succeeded, but the driver never actually knows 
> whether a task commit succeeded. We should update the reason message to make 
> it less confusing.
> See https://github.com/apache/spark/pull/36564#discussion_r1598660630



