[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7369?focusedWorklogId=776727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776727
 ]

ASF GitHub Bot logged work on MAPREDUCE-7369:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jun/22 09:15
            Start Date: 01/Jun/22 09:15
    Worklog Time Spent: 10m 
      Work Description: iwasakims commented on code in PR #4247:
URL: https://github.com/apache/hadoop/pull/4247#discussion_r886578663


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml:
##########
@@ -286,6 +286,13 @@
   </description>
 </property>
 
+<property>
+  <name>mapreduce.task.enable.ping-for-liveliness-check</name>
+  <value>true</value>

Review Comment:
   @ashutoshcipher Disabling this feature by default sounds like the right 
direction, as @cnauroth mentioned. While no unexpected side effects were 
observed in the test results of the current patch, there should be an 
additional test case covering the code path with the new feature enabled.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 776727)
    Time Spent: 2h  (was: 1h 50m)

> MapReduce tasks timing out when spends more time on MultipleOutputs#close
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7369
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7369
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.3.1
>            Reporter: Prabhu Joseph
>            Assignee: Ashutosh Gupta
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: MAPREDUCE-7369.001.patch
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> MapReduce tasks time out when they spend a long time in 
> MultipleOutputs#close. MultipleOutputs#close takes longer when there are 
> many files to close and closing a stream has high latency.
> {code}
> 2021-11-01 02:45:08,312 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1634949471086_61268_m_001115_0: 
> AttemptID:attempt_1634949471086_61268_m_001115_0 Timed out after 300 secs
> {code}
> The MapReduce task timeout can be increased, but it is hard to choose the 
> right value. The timeout can be disabled by setting it to 0, but then 
> hanging tasks might never be killed.
> The tasks send a ping every 3 seconds, but the ApplicationMaster does not 
> count these pings. It expects status information, which is not sent during 
> MultipleOutputs#close. This Jira adds a config that makes the 
> ApplicationMaster treat a ping from the task as part of its task liveliness 
> check.
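
The behaviour described above can be illustrated with a minimal sketch. It is
not the code from the patch: the class TaskLivenessMonitor and all of its
methods are hypothetical, and it only assumes that the ApplicationMaster keeps
a "last seen" timestamp per task attempt, that a status update always refreshes
it, and that a ping refreshes it only when the new flag is enabled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified model of task liveness tracking (not Hadoop code). */
class TaskLivenessMonitor {
    // Assumed to mirror mapreduce.task.enable.ping-for-liveliness-check.
    private final boolean pingCountsAsLiveness;
    // Assumed to mirror the task timeout in milliseconds; 0 disables the check.
    private final long timeoutMillis;
    // Last time each attempt was considered alive, keyed by attempt id.
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    TaskLivenessMonitor(boolean pingCountsAsLiveness, long timeoutMillis) {
        this.pingCountsAsLiveness = pingCountsAsLiveness;
        this.timeoutMillis = timeoutMillis;
    }

    /** Starts tracking an attempt. */
    void register(String attemptId, long nowMillis) {
        lastSeen.put(attemptId, nowMillis);
    }

    /** A status update always counts as a sign of life. */
    void onStatusUpdate(String attemptId, long nowMillis) {
        lastSeen.put(attemptId, nowMillis);
    }

    /** A ping counts as a sign of life only when the flag is enabled. */
    void onPing(String attemptId, long nowMillis) {
        if (pingCountsAsLiveness) {
            lastSeen.put(attemptId, nowMillis);
        }
    }

    /** Returns true if the attempt has been silent for longer than the timeout. */
    boolean isTimedOut(String attemptId, long nowMillis) {
        if (timeoutMillis <= 0) {
            return false; // timeout disabled
        }
        Long last = lastSeen.get(attemptId);
        return last != null && nowMillis - last > timeoutMillis;
    }

    public static void main(String[] args) {
        // Simulate a task stuck in a long close(): pings every 3 s, no status updates.
        for (boolean flag : new boolean[] {false, true}) {
            TaskLivenessMonitor monitor = new TaskLivenessMonitor(flag, 300_000L);
            monitor.register("attempt_0", 0L);
            for (long t = 3_000L; t <= 400_000L; t += 3_000L) {
                monitor.onPing("attempt_0", t);
            }
            System.out.println("flag=" + flag + " timedOut="
                + monitor.isTimedOut("attempt_0", 400_000L));
        }
    }
}
```

With the flag off, an attempt that only pings for more than 300 seconds is
reported as timed out, which matches the log excerpt above. With the flag on,
the same attempt stays alive for as long as it keeps pinging.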



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
