Github user JoshRosen commented on the issue:

    https://github.com/apache/spark/pull/14932
  
    Context for other reviewers:
    
    As of #9610 (Spark 1.5.3+), map tasks write their output to temporary files and then atomically rename those files into their final destination paths. The `IndexShuffleBlockResolver.removeDataByMap()` call deletes the final map output files, not the temporary files. Therefore, if a map task fails before the final files are written, it does not need to call this cleanup method.
    
    Output files are renamed to their final locations by `IndexShuffleBlockResolver.writeIndexFileAndCommit()`, which is called at the end of the shuffle write. If we assume that this method is atomic with respect to failures (i.e. it doesn't leave any state behind after failing) and also assume that no step after `writeIndexFileAndCommit()` can fail, then it should be fine to omit these `removeDataByMap` calls in the error-handling cases.
    
    Looking at `writeIndexFileAndCommit`, I think I spot a rare corner-case 
where it might behave non-atomically:
    
    ```scala
    else {
      // This is the first successful attempt in writing the map outputs for this task,
      // so override any existing index and data files with the ones we wrote.
      if (indexFile.exists()) {
        indexFile.delete()
      }
      if (dataFile.exists()) {
        dataFile.delete()
      }
      if (!indexTmp.renameTo(indexFile)) {
        throw new IOException("fail to rename file " + indexTmp + " to " + indexFile)
      }
      if (dataTmp != null && dataTmp.exists() && !dataTmp.renameTo(dataFile)) {
        throw new IOException("fail to rename file " + dataTmp + " to " + dataFile)
      }
    }
    ```
    
    It's possible for the index rename to succeed while the data file rename fails. In that case we should probably clean up the data temp file.
    
    Similarly, the existing code won't necessarily clean up `indexTmp` or `dataTmp` in error conditions.
    
    I don't think either of these is a huge issue in practice, but they might be worth fixing as long as we're removing an additional layer of error-handling defense as part of this PR.
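    A fix along those lines might look something like the following sketch. To be clear, this is hypothetical illustration code, not the actual Spark change; `commitOutputs` and its rollback policy are my own stand-ins that mirror the snippet above:

    ```scala
    import java.io.{File, IOException}

    // Hypothetical sketch: wrap the two renames so that a failure in either one
    // deletes both temp files, and rolls back the index rename if it already
    // succeeded, before rethrowing. Not Spark code; names mirror the snippet above.
    def commitOutputs(indexTmp: File, indexFile: File,
                      dataTmp: File, dataFile: File): Unit = {
      var committed = false
      try {
        if (indexFile.exists()) indexFile.delete()
        if (dataFile.exists()) dataFile.delete()
        if (!indexTmp.renameTo(indexFile)) {
          throw new IOException(s"fail to rename file $indexTmp to $indexFile")
        }
        if (dataTmp != null && dataTmp.exists() && !dataTmp.renameTo(dataFile)) {
          throw new IOException(s"fail to rename file $dataTmp to $dataFile")
        }
        committed = true
      } finally {
        if (!committed) {
          // Clean up whatever is left behind: the temp files and, if the index
          // rename already succeeded, the now-orphaned committed index file.
          if (indexTmp.exists()) indexTmp.delete()
          if (dataTmp != null && dataTmp.exists()) dataTmp.delete()
          if (indexFile.exists()) indexFile.delete()
        }
      }
    }
    ```

    The `try`/`finally` keeps the happy path identical to the current code while guaranteeing that an exception from either rename leaves no partial state on disk.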

