We're having trouble with the number of counters that Crunch creates
when writing to many different output files (slightly more than 120).
This wouldn't be an issue if we could configure the maximum number of
allowed counters. Unfortunately, we are running an older version of
Hadoop where that is not an option, so we have to patch Crunch locally
to leave out the counters every time we pick up a new release. The
required patch (a one-line change) can be found in the attachment.
I'm not saying the counters should be removed, but would it be an option to
make them configurable without paying too much of a performance penalty?
Regards,
Dominique Dierickx
diff --git a/crunch-core/src/main/java/org/apache/crunch/io/CrunchOutputs.java b/crunch-core/src/main/java/org/apache/crunch/io/CrunchOutputs.java
index ccf4fb5..71e3bbe 100644
--- a/crunch-core/src/main/java/org/apache/crunch/io/CrunchOutputs.java
+++ b/crunch-core/src/main/java/org/apache/crunch/io/CrunchOutputs.java
@@ -125,7 +125,7 @@ public class CrunchOutputs<K, V> {
namedOutput + "'");
}
TaskAttemptContext taskContext = getContext(namedOutput);
- baseContext.getCounter(COUNTERS_GROUP, namedOutput).increment(1);
+ //baseContext.getCounter(COUNTERS_GROUP, namedOutput).increment(1);
getRecordWriter(taskContext, namedOutput).write(key, value);
}
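
As a rough illustration of the configurable variant suggested above (as opposed to commenting the line out), the flag could be read once when the outputs object is created and then checked with a plain boolean test on each write, which should keep the per-record cost small. This is only a sketch under assumptions: the property name `crunch.outputs.counters.enabled` is hypothetical and does not exist in Crunch or Hadoop, and the class below is a standalone stand-in that uses plain maps in place of the Hadoop `Configuration` and counter APIs, so it is not the actual `CrunchOutputs` code.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone stand-in for the write path in CrunchOutputs; not the real class.
public class ConfigurableCounterSketch {
    // Hypothetical property name, made up for this example.
    static final String COUNTERS_ENABLED = "crunch.outputs.counters.enabled";

    // Read once at construction so write() only pays for a boolean check.
    private final boolean countersEnabled;
    // Stand-in for the per-named-output Hadoop counters.
    private final Map<String, Long> counters = new HashMap<>();
    private long recordsWritten = 0;

    ConfigurableCounterSketch(Map<String, String> conf) {
        // Default to "true" so behaviour is unchanged unless the flag is set.
        this.countersEnabled =
            Boolean.parseBoolean(conf.getOrDefault(COUNTERS_ENABLED, "true"));
    }

    void write(String namedOutput) {
        if (countersEnabled) {
            // Corresponds to the getCounter(...).increment(1) line in the patch.
            counters.merge(namedOutput, 1L, Long::sum);
        }
        // Corresponds to getRecordWriter(...).write(key, value).
        recordsWritten++;
    }

    int counterCount() {
        return counters.size();
    }

    long recordsWritten() {
        return recordsWritten;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COUNTERS_ENABLED, "false");
        ConfigurableCounterSketch outputs = new ConfigurableCounterSketch(conf);
        // Write to 121 distinct named outputs, as in the scenario described.
        for (int i = 0; i < 121; i++) {
            outputs.write("output" + i);
        }
        // prints "0 counters, 121 records"
        System.out.println(outputs.counterCount() + " counters, "
            + outputs.recordsWritten() + " records");
    }
}
```

With the flag left unset, the sketch creates one counter per named output, as the current code does; setting it to `false` skips counter creation entirely while the records are still written.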