LoggingResearch created MAPREDUCE-7486:
------------------------------------------
Summary: Handling Cluster Storage Capacity Exceeded Exception with
Enhanced Logging
Key: MAPREDUCE-7486
URL: https://issues.apache.org/jira/browse/MAPREDUCE-7486
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mapreduce-client
Affects Versions: 3.3.6
Environment: Version: {{3.3.6}}
Location:
{{hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java}},
in the {{reportError}} method, lines 241-250.
Reporter: LoggingResearch
Attachments: TestYarnChild.java, original-vs-log-enhanced.md
The {{reportError}} method in {{YarnChild.java}} handles exceptions raised
during job execution. When the exception is caused by the cluster storage
capacity being exceeded, the method does not log enough detail, especially when
the job is not configured to fail fast. Users then have little indication of
why the job did not fail immediately once the storage capacity was exceeded.
This enhancement adds detailed logging that tells users which configuration
prevents the fast failure.
*Expected Behavior:*
When a {{ClusterStorageCapacityExceededException}} is encountered, the system
should log whether the job is configured to fail fast. If fast fail is
disabled, the log should advise users on how to enable it.
*How-to-Fix:*
We propose to *expose the relationship between the fail-fast configuration and
the job's failure behavior through logging.*
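
The proposed behavior can be illustrated with a minimal, self-contained sketch. It is not the actual {{YarnChild}} code and does not depend on Hadoop: the class {{FastFailLoggingSketch}}, the method {{decideFastFail}}, the stand-in exception {{StorageCapacityExceeded}}, the plain {{Map}} used as configuration, and the log wording are all hypothetical. The property name is our assumption of the relevant key and should be checked against {{MRJobConfig}} before it appears in any real message.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

public class FastFailLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(FastFailLoggingSketch.class.getName());

  // Assumed property name; verify against MRJobConfig before relying on it.
  static final String KILL_LIMIT_EXCEED_KEY =
      "mapreduce.job.dfs.storage.capacity.kill-limit-exceed";

  /** Hypothetical stand-in for ClusterStorageCapacityExceededException. */
  static class StorageCapacityExceeded extends IOException {
    StorageCapacityExceeded(String message) {
      super(message);
    }
  }

  /** Returns true if the stand-in exception appears anywhere in the cause chain. */
  static boolean causedByCapacityExceeded(Throwable error) {
    Throwable current = error;
    // The depth bound guards against pathological cyclic cause chains.
    for (int depth = 0; current != null && depth < 64; depth++) {
      if (current instanceof StorageCapacityExceeded) {
        return true;
      }
      current = current.getCause();
    }
    return false;
  }

  /**
   * Decides whether the job should fail fast, and logs the decision in both
   * branches so that the effect of the configuration is visible to the user.
   */
  static boolean decideFastFail(Throwable error, Map<String, String> conf) {
    if (!causedByCapacityExceeded(error)) {
      return false; // unrelated failure: the normal retry policy applies
    }
    boolean killOnExceed =
        Boolean.parseBoolean(conf.getOrDefault(KILL_LIMIT_EXCEED_KEY, "false"));
    if (killOnExceed) {
      LOG.severe("Cluster storage capacity exceeded; " + KILL_LIMIT_EXCEED_KEY
          + "=true, so the job fails fast and task retries are skipped.");
      return true;
    }
    // This is the branch the issue says is under-logged today.
    LOG.warning("Cluster storage capacity exceeded, but " + KILL_LIMIT_EXCEED_KEY
        + " is not enabled, so the task will be retried. Set it to true"
        + " to fail the job immediately.");
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    Throwable error =
        new RuntimeException(new StorageCapacityExceeded("quota exceeded"));
    decideFastFail(error, conf); // logs the advisory warning
    conf.put(KILL_LIMIT_EXCEED_KEY, "true");
    decideFastFail(error, conf); // logs the fail-fast message
  }
}
```

Logging in both branches, rather than only when fast fail is enabled, is the point of the proposal: the warning names the setting, so a user reading the task log can see why the job kept retrying.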
--
This message was sent by Atlassian Jira
(v8.20.10#820010)