This is an automated email from the ASF dual-hosted git repository.
stevel pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/hadoop.git
The following commit(s) were added to refs/heads/branch-3.4 by this push:
new ecc32943804 MAPREDUCE-7448. Skipping cleanup with FileOutputCommitter
V1 can corrupt output: warn and document (#6038)
ecc32943804 is described below
commit ecc32943804c3d7db8043683443a739851d482f6
Author: ConfX <[email protected]>
AuthorDate: Tue Oct 14 05:33:45 2025 -0500
MAPREDUCE-7448. Skipping cleanup with FileOutputCommitter V1 can corrupt
output: warn and document (#6038)
* update documentation for mapreduce committer
* add warning if the user attempts to use FileOutputCommitter V1 with
skipping cleanup
Contributed by ConfX
---
.../org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java | 5 +++++
.../src/site/markdown/manifest_committer.md | 1 +
2 files changed, 6 insertions(+)
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
index 82b7fcb5046..5adc1cd14d4 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java
@@ -158,6 +158,11 @@ public FileOutputCommitter(Path outputPath,
"output directory:" + skipCleanup + ", ignore cleanup failures: " +
ignoreCleanupFailures);
+ if (algorithmVersion == 1 && skipCleanup) {
+ LOG.warn("Skip cleaning up when using FileOutputCommitter V1 can lead to unexpected behaviors. " +
+ "For example, committing several times may be allowed falsely.");
+ }
+
if (outputPath != null) {
FileSystem fs = outputPath.getFileSystem(context.getConfiguration());
this.outputPath = fs.makeQualified(outputPath);
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md
index 0ac03080195..3c2c5891bc2 100644
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md
@@ -202,6 +202,7 @@ Here are the main configuration options of the committer.
There are some more, as covered in the (Advanced)[#advanced] section.
+WARNING: setting `mapreduce.fileoutputcommitter.cleanup.skipped` to `true` is not compatible with version 1 of the committer and can cause unexpected behaviors.
## <a name="scaling"></a> Scaling jobs
`mapreduce.manifest.committer.io.threads`
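
Illustrative note, not part of the commit: the condition this change warns about could also be checked before job submission. The sketch below mirrors the `algorithmVersion == 1 && skipCleanup` test from the diff. It is a minimal, hypothetical helper: the class `CommitterConfigCheck` and its method are invented for illustration, a plain `Map` stands in for the Hadoop `Configuration`, the default algorithm version is passed in by the caller rather than assumed, and treating an unset `cleanup.skipped` as `false` is an assumption.

```java
import java.util.Map;

/** Hypothetical pre-flight check; this class is not part of Hadoop. */
final class CommitterConfigCheck {
  // Property names read by FileOutputCommitter.
  static final String ALGORITHM_VERSION =
      "mapreduce.fileoutputcommitter.algorithm.version";
  static final String CLEANUP_SKIPPED =
      "mapreduce.fileoutputcommitter.cleanup.skipped";

  private CommitterConfigCheck() {
  }

  /**
   * Returns true when the settings select committer V1 with cleanup skipped,
   * the combination the commit warns about.
   *
   * @param conf job settings; a plain map standing in for a Configuration
   * @param defaultVersion algorithm version to assume when the key is unset
   */
  static boolean isRiskyCombination(Map<String, String> conf, int defaultVersion) {
    String rawVersion = conf.get(ALGORITHM_VERSION);
    int version = (rawVersion == null)
        ? defaultVersion
        : Integer.parseInt(rawVersion.trim());
    // Assumption: an unset cleanup.skipped key means cleanup is not skipped.
    boolean skipCleanup =
        Boolean.parseBoolean(conf.getOrDefault(CLEANUP_SKIPPED, "false").trim());
    return version == 1 && skipCleanup;
  }

  public static void main(String[] args) {
    Map<String, String> conf = Map.of(
        ALGORITHM_VERSION, "1",
        CLEANUP_SKIPPED, "true");
    if (isRiskyCombination(conf, 2)) {
      System.err.println("WARNING: " + CLEANUP_SKIPPED
          + "=true is not compatible with committer V1");
    }
  }
}
```

A caller would pass whatever default algorithm version its Hadoop release uses; the `2` in `main` is only a placeholder for this example.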