[ https://issues.apache.org/jira/browse/MAPREDUCE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929814#comment-17929814 ]
ASF GitHub Bot commented on MAPREDUCE-7500:
-------------------------------------------

steveloughran commented on PR #7425:
URL: https://github.com/apache/hadoop/pull/7425#issuecomment-2678822267

   I don't want to go anywhere near that code as it is (a) critical and (b) an incredibly complicated co-recursive mix of two algorithms where you have to step through with a debugger to work out WTF is going wrong. It isn't suited to cloud storage, and even with HDFS it hits limits due to lack of parallelisation.

   So sorry, no, I don't want to touch this. There's just too much risk.

   At the same time, if we can speed up the manifest committer, there's appeal there. Glancing at the RenameFilesStage, it already remembers if a dir had to be created, and if so knows there's nothing at the far end. Otherwise it does that probe + delete. An optimistic commit there may have benefits, especially with Azure, where the HEAD probe will double the IO load of any rename, and job commit can put a lot of strain on IO quotas.

   Can you take a look there? I'm going to recommend:

   * start with the base manifest committer and your normal workload
   * set up a dir in `mapreduce.manifest.committer.summary.report.directory`. This will save the iostatistics summary of the job for viewing, including summaries of the number and duration of delete calls.
   * see if you could make RenameFilesStage.commitOneFile more optimistic. Ultimately this'd have to be made optional, but for an experiment it'd be good to see what gains you get.

   Test on HDFS: the manifest committer works well there *and is more performant than the older committer, due to the parallel renames*. A before/after test on abfs would be interesting too. ABFS is a special pain point here as it does have problems with rename under load; if that load can be reduced, then that's good. But if the parent dirs are actually created during the commit, such as when committing into an empty directory tree, I wouldn't expect any change at all, because the probe + delete is already skipped in that case.

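   To make the probe + delete versus optimistic rename trade-off concrete, here is a minimal sketch using only `java.nio.file`. It is an analogy, not Hadoop code: the class `OptimisticRenameSketch`, the `IoCounter` type and both method names are hypothetical, and it assumes a rename onto an existing file fails visibly (as `Files.move` does without `REPLACE_EXISTING`). How a real Hadoop `FileSystem` reports that case varies by store, so the fallback trigger would need checking per filesystem.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Plain-JDK analogy of the two commit strategies; hypothetical, not Hadoop code. */
public class OptimisticRenameSketch {

    /** Counts the calls each strategy issues so the two can be compared. */
    static final class IoCounter {
        int probes;
        int deletes;
        int renames;
    }

    /** Pessimistic: always probe the destination, delete it if present, then rename. */
    static void commitWithProbe(Path source, Path dest, IoCounter io) throws IOException {
        io.probes++;
        if (Files.exists(dest)) {
            io.deletes++;
            Files.delete(dest);
        }
        io.renames++;
        Files.move(source, dest);
    }

    /** Optimistic: rename first; delete and retry only if the destination was occupied. */
    static void commitOptimistically(Path source, Path dest, IoCounter io) throws IOException {
        io.renames++;
        try {
            // Assumption: the move fails with an exception when dest already exists.
            Files.move(source, dest);
        } catch (FileAlreadyExistsException e) {
            io.deletes++;
            Files.delete(dest);
            io.renames++;
            Files.move(source, dest);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-sketch");
        IoCounter probing = new IoCounter();
        IoCounter optimistic = new IoCounter();
        // Commit three files into an empty destination with each strategy.
        for (int i = 0; i < 3; i++) {
            Path a = Files.createFile(dir.resolve("task-a-" + i));
            commitWithProbe(a, dir.resolve("dest-a-" + i), probing);
            Path b = Files.createFile(dir.resolve("task-b-" + i));
            commitOptimistically(b, dir.resolve("dest-b-" + i), optimistic);
        }
        System.out.printf("probe first: probes=%d deletes=%d renames=%d%n",
                probing.probes, probing.deletes, probing.renames);
        System.out.printf("optimistic:  probes=%d deletes=%d renames=%d%n",
                optimistic.probes, optimistic.deletes, optimistic.renames);
    }
}
```

   With an empty destination the optimistic path issues one call per file instead of two; when a conflict does occur it costs an extra failed rename, which is why the behaviour would ultimately need to be optional.
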
> Support optimistic file renames in the commit protocol
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-7500
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7500
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>         Environment: The commit protocol in FileOutputCommitter now supports
> optimistic commits for files. This saves a FileSystem.getFileStatus call for
> cases where a conflict in the destination location is not expected at
> commit time (e.g. Spark). This feature is disabled by default. To enable it,
> set mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true.
>            Reporter: Rob Reeves
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: flamegraph_commit.png
>
>
> During a file commit, FileOutputCommitter assumes a file may be in the
> destination location and, if so, deletes it first. This means that for every
> file commit it calls FileSystem.getFileStatus for the destination. For the
> Spark use case, nothing is expected to exist in the destination location,
> so the getFileStatus call is wasted in all but exceptional and unexpected
> cases.
> The getFileStatus call can take significant time. When I profiled a commit in
> our environment (HDFS, intermittent latency issues) the
> FileSystem.getFileStatus call took 50% of the commit time. We have an
> aggressive auto-msync setting, but even when I disabled msync I saw the same
> behavior. I attached an example flame graph for the commit time
> (getFileStatus time is highlighted in pink).
> To avoid the time spent on getFileStatus, there should be an option to
> optimistically commit the file, assuming there will be no conflict in the
> destination.