[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929814#comment-17929814
 ] 

ASF GitHub Bot commented on MAPREDUCE-7500:
-------------------------------------------

steveloughran commented on PR #7425:
URL: https://github.com/apache/hadoop/pull/7425#issuecomment-2678822267

   I don't want to go anywhere near that code as it is (a) critical and (b) an 
incredibly complicated co-recursive mix of two algorithms where you have to 
step through it with a debugger to work out WTF is going wrong.
   
   It isn't suited to cloud storage and even with HDFS, it hits limits due to 
lack of parallelisation.
   
   So sorry, no, I don't want to touch this. There's just too much risk.
   
   At the same time, if we can speed up the manifest committer, there's appeal 
there. Glancing at the RenameFilesStage, it already remembers if a dir had to 
be created, and if so it knows there's nothing at the far end. Otherwise it 
does that probe + delete. 
   
   An optimistic commit there may have benefits, especially with Azure where 
the HEAD probe will double the IO load of any rename, and job commit can put a 
lot of strain on IO quotas.
   
   Can you take a look there?
   
   I'm going to recommend
   * start with base manifest committer and your normal workload
   * set `mapreduce.manifest.committer.summary.report.directory` to a 
directory. This will save the iostatistics summary of the job for viewing, 
including summaries of number and duration of delete calls.
   * see if you could make RenameFilesStage.commitOneFile more optimistic. 
Ultimately this'd have to be made optional, but for an experiment it'd be good 
to see what gains you get
   
   Test on HDFS: it works well there *and is more performant than the older 
committer, due to the parallel renames*.
   
   A before/after test on ABFS would be interesting too. ABFS is a special pain 
point here as it does have problems with rename under load; if that load can be 
reduced, then that's good. But if the parent dirs are actually created, such as 
when committing into an empty directory tree, I wouldn't expect any change at 
all.
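
To make the trade-off concrete, here is a minimal sketch of the two per-file commit strategies discussed above. It is not the RenameFilesStage code: `CountingStore`, `commitWithProbe`, `commitOptimistic` and `simulate` are hypothetical names, the store is an in-memory model that only counts calls, and it assumes a rename onto an existing destination fails with an `IOException` (real filesystems report that conflict in different ways).

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class OptimisticRenameSketch {

    /** Hypothetical in-memory store which counts the calls a real store would serve. */
    static class CountingStore {
        final Map<String, String> files = new HashMap<>();
        int probes;
        int deletes;
        int renames;

        boolean exists(String path) {
            probes++;
            return files.containsKey(path);
        }

        void delete(String path) {
            deletes++;
            files.remove(path);
        }

        /** Assumption of this model: rename never overwrites, it raises instead. */
        void rename(String src, String dest) throws IOException {
            renames++;
            if (!files.containsKey(src)) {
                throw new IOException("no source: " + src);
            }
            if (files.containsKey(dest)) {
                throw new IOException("destination exists: " + dest);
            }
            files.put(dest, files.remove(src));
        }
    }

    static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    /** Probe-first: skip the probe only when this job created the parent dir. */
    static void commitWithProbe(CountingStore store, Set<String> createdDirs,
            String src, String dest) throws IOException {
        if (!createdDirs.contains(parentOf(dest)) && store.exists(dest)) {
            store.delete(dest);
        }
        store.rename(src, dest);
    }

    /** Optimistic: rename first; probe and delete only after a failed rename. */
    static void commitOptimistic(CountingStore store, String src, String dest)
            throws IOException {
        try {
            store.rename(src, dest);
        } catch (IOException e) {
            if (!store.exists(dest)) {
                throw e; // the failure was not a destination conflict
            }
            store.delete(dest);
            store.rename(src, dest);
        }
    }

    /** Commits n files into a pre-existing empty dir; returns {probes, deletes, renames}. */
    static int[] simulate(int n, boolean optimistic) throws IOException {
        CountingStore store = new CountingStore();
        Set<String> createdDirs = new HashSet<>(); // empty: the dest dir already existed
        for (int i = 0; i < n; i++) {
            String src = "/job/task/part-" + i;
            String dest = "/out/part-" + i;
            store.files.put(src, "data-" + i);
            if (optimistic) {
                commitOptimistic(store, src, dest);
            } else {
                commitWithProbe(store, createdDirs, src, dest);
            }
        }
        return new int[] {store.probes, store.deletes, store.renames};
    }

    public static void main(String[] args) throws IOException {
        int[] probeFirst = simulate(100, false);
        int[] optimistic = simulate(100, true);
        System.out.println("probe-first probes=" + probeFirst[0] + " renames=" + probeFirst[2]);
        System.out.println("optimistic  probes=" + optimistic[0] + " renames=" + optimistic[2]);
    }
}
```

In this model, committing into an existing directory with no conflicts costs one probe per file on the probe-first path and none on the optimistic path; a conflict on the optimistic path costs an extra failed rename plus the probe and delete. If the parent directory is in `createdDirs`, the probe-first path already skips the probe, so the two paths issue the same calls.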
   




> Support optimistic file renames in the commit protocol
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-7500
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7500
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>         Environment: The commit protocol in FileOutputCommitter now supports 
> optimistic commits for files. This saves a FileSystem.getFileStatus call for 
> cases where it is unexpected to have conflict in the destination location at 
> commit time (e.g. Spark). This feature is disabled by default. To enable it 
> set mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true.
>            Reporter: Rob Reeves
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: flamegraph_commit.png
>
>
> During a file commit, FileOutputCommitter assumes a file may be in the 
> destination location and if so will delete it first. This means that for 
> every file commit it calls FileSystem.getFileStatus for the destination. For 
> the Spark use case, there will be nothing existing in the destination 
> location in the expected case, so the getFileStatus call is wasted in all but 
> exceptional and unexpected cases.
> The getFileStatus call can take significant time. When I profiled a commit in 
> our environment (HDFS, intermittent latency issues) the 
> FileSystem.getFileStatus call takes 50% of the commit time. We have an 
> aggressive auto-msync setting, but even when I disabled msync I saw the same 
> behavior. I attached an example flame graph for the commit time 
> (getFileStatus time is highlighted in pink).
> To avoid the time spent on getFileStatus, there should be an option to 
> optimistically commit the file assuming there will be no conflict in the 
> destination.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
