[
https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shilun Fan updated MAPREDUCE-7272:
----------------------------------
Component/s: test
Hadoop Flags: Reviewed
Target Version/s: 2.10.1, 3.2.2, 3.1.4, 3.3.0, 3.4.0
Affects Version/s: 2.10.1, 3.2.2, 3.1.4, 3.3.0, 3.4.0
> TaskAttemptListenerImpl excessive log messages
> ----------------------------------------------
>
> Key: MAPREDUCE-7272
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7272
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: test
> Affects Versions: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 3.4.0
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Fix For: 2.8.6, 3.3.0, 2.9.3, 3.1.4, 3.2.2, 2.10.1, 3.4.0
>
> Attachments: MAPREDUCE-7272-branch-2.10.001.patch,
> MAPREDUCE-7272-branch-2.10.002.patch, MAPREDUCE-7272-branch-2.10.003.patch,
> MAPREDUCE-7272-branch-2.10.004.patch, MAPREDUCE-7272.001.patch,
> MAPREDUCE-7272.002.patch, MAPREDUCE-7272.003.patch, MAPREDUCE-7272.004.patch
>
>
> {{TaskAttemptListenerImpl.statusUpdate()}} bloats the log files.
> On every call, the listener uses {{LOG.info()}} to print out the progress of
> the {{TaskAttempt}}.
> {code:java}
> taskAttemptStatus.progress = taskStatus.getProgress();
> LOG.info("Progress of TaskAttempt " + taskAttemptID + " is : "
> + taskStatus.getProgress());
> {code}
>
> {code:bash}
> 2020-04-07 10:20:50,708 INFO [IPC Server handler 17 on 43926]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1586003420099_716645_m_007783_0 is : 0.40713295
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 7 on 43926]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1586003420099_716645_m_020681_0 is : 0.55573714
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 26 on 43926]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1586003420099_716645_m_024371_0 is : 0.54190344
> 2020-04-07 10:20:50,738 INFO [IPC Server handler 15 on 43926]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1586003420099_716645_m_033182_0 is : 0.50264555
> 2020-04-07 10:20:50,748 INFO [IPC Server handler 3 on 43926]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
> attempt_1586003420099_716645_m_022375_0 is : 0.5495565
> {code}
> After discussing this issue with [~nroberts], [~ebadger], and [~epayne], we
> agreed that while a log of task progress is helpful, logging it on every
> update is excessive.
> This Jira is to suppress the excessive logging from TaskAttemptListener
> without affecting the frequency of progress updates.
> There are two flags:
> * {{-Dmapreduce.task.log.progress.delta.threshold=0.10}}: the task progress
> is logged each time it advances by 10%. Default is 5%.
> * {{-Dmapreduce.task.log.progress.wait.interval-seconds=120}}: the listener
> logs the progress every 2 minutes. This is helpful for long-running tasks
> that take a long time to reach the delta threshold. Default is
> 1 minute.
> The listener logs as soon as either {{delta.threshold}} or
> {{wait.interval-seconds}} is reached, whichever comes first.
> Enabling {{LOG.DEBUG}} for {{TaskAttemptListenerImpl}} will override
> those two flags and log the task progress on every update.
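
The rule described above (log when either the progress delta or the wait interval is reached, and always log when debug is enabled) can be sketched roughly as follows. This is an illustration only, not the code from the attached patches: the class {{ProgressLogThrottle}}, its method {{shouldLog}}, and the choice to always log the first update of an attempt are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the class used in the patches.
class ProgressLogThrottle {
  // Progress value and timestamp recorded the last time an attempt was logged.
  private static final class LastLogged {
    double progress;
    long timeMillis;
  }

  private final double deltaThreshold;    // e.g. 0.05 for the 5% default
  private final long waitIntervalMillis;  // e.g. 60 s for the 1 minute default
  private final Map<String, LastLogged> lastLogged = new HashMap<>();

  ProgressLogThrottle(double deltaThreshold, long waitIntervalSeconds) {
    this.deltaThreshold = deltaThreshold;
    this.waitIntervalMillis = waitIntervalSeconds * 1000L;
  }

  // Returns true if this status update should be written to the log.
  synchronized boolean shouldLog(String attemptId, double progress,
      long nowMillis, boolean debugEnabled) {
    LastLogged last = lastLogged.get(attemptId);
    boolean first = (last == null);
    if (first) {
      last = new LastLogged();
      lastLogged.put(attemptId, last);
    }
    // Assumption: the first update of an attempt is always logged.
    boolean due = first
        || debugEnabled
        || progress - last.progress >= deltaThreshold
        || nowMillis - last.timeMillis >= waitIntervalMillis;
    if (due) {
      // Both counters restart from the update that was just logged.
      last.progress = progress;
      last.timeMillis = nowMillis;
    }
    return due;
  }
}
```

In this sketch the caller would wrap the existing {{LOG.info()}} call in {{if (throttle.shouldLog(...))}}, so the frequency of progress updates themselves is unchanged and only the log output is reduced.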