[ https://issues.apache.org/jira/browse/MAPREDUCE-7369?focusedWorklogId=776727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776727 ]
ASF GitHub Bot logged work on MAPREDUCE-7369:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jun/22 09:15
            Start Date: 01/Jun/22 09:15
    Worklog Time Spent: 10m
      Work Description: iwasakims commented on code in PR #4247:
URL: https://github.com/apache/hadoop/pull/4247#discussion_r886578663


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml:
##########
@@ -286,6 +286,13 @@
   </description>
 </property>
 
+<property>
+  <name>mapreduce.task.enable.ping-for-liveliness-check</name>
+  <value>true</value>

Review Comment:
   @ashutoshcipher Disabling this feature by default sounds like the right direction, as @cnauroth mentioned. While no unexpected side effects were observed in the test results of the current patch, there should be an additional test case covering the code path with the new feature enabled.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 776727)
    Time Spent: 2h  (was: 1h 50m)

> MapReduce tasks timing out when spends more time on MultipleOutputs#close
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7369
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7369
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.3.1
>            Reporter: Prabhu Joseph
>            Assignee: Ashutosh Gupta
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: MAPREDUCE-7369.001.patch
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> MapReduce tasks time out when they spend a long time in MultipleOutputs#close.
> MultipleOutputs#close takes longer when there are multiple files to be
> closed and closing a stream has high latency.
> {code}
> 2021-11-01 02:45:08,312 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
> report from attempt_1634949471086_61268_m_001115_0:
> AttemptID:attempt_1634949471086_61268_m_001115_0 Timed out after 300 secs
> {code}
> The MapReduce task timeout can be increased, but it is hard to choose the
> right value. The timeout can be disabled by setting it to 0, but then
> hanging tasks might never be killed.
> The tasks send a ping every 3 seconds, but the ApplicationMaster does not
> honor these pings. It expects status information, which is not sent
> during MultipleOutputs#close. This Jira adds a config that makes the
> ApplicationMaster count a ping from the task as part of its task
> liveliness check.
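
The behaviour described in the issue can be illustrated with a small, self-contained sketch. This is not the ApplicationMaster's actual code: the class `LivelinessSketch` and all of its methods are hypothetical names invented for illustration, and the boolean flag only stands in for the config proposed in the PR. It assumes a 300-second timeout and a 3-second ping interval, as quoted in the issue, and a task that sends no status updates for 400 seconds while closing outputs.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical, simplified model of a task liveliness monitor.
 * It is NOT Hadoop's implementation; names and structure are assumptions.
 */
public class LivelinessSketch {
    private final long timeoutMs;
    // Stands in for the proposed config: should a bare ping refresh liveliness?
    private final boolean pingCountsAsLiveliness;
    private final Map<String, Long> lastSeenMs = new HashMap<>();

    public LivelinessSketch(long timeoutMs, boolean pingCountsAsLiveliness) {
        this.timeoutMs = timeoutMs;
        this.pingCountsAsLiveliness = pingCountsAsLiveliness;
    }

    /** Start tracking an attempt at the given time. */
    public void register(String attemptId, long nowMs) {
        lastSeenMs.put(attemptId, nowMs);
    }

    /** A full status update always refreshes the attempt. */
    public void statusUpdate(String attemptId, long nowMs) {
        lastSeenMs.put(attemptId, nowMs);
    }

    /** A ping refreshes the attempt only when the flag is enabled. */
    public void ping(String attemptId, long nowMs) {
        if (pingCountsAsLiveliness) {
            lastSeenMs.put(attemptId, nowMs);
        }
    }

    /** A timeout of 0 disables the check, as described in the issue. */
    public boolean isTimedOut(String attemptId, long nowMs) {
        Long seen = lastSeenMs.get(attemptId);
        if (seen == null || timeoutMs <= 0) {
            return false;
        }
        return nowMs - seen > timeoutMs;
    }

    /** Simulates a task that only pings (every 3 s) for 400 s. */
    static boolean simulateLongClose(boolean pingCountsAsLiveliness) {
        LivelinessSketch monitor = new LivelinessSketch(300_000L, pingCountsAsLiveliness);
        String attempt = "attempt_0"; // placeholder attempt id
        monitor.register(attempt, 0L);
        for (long t = 3_000L; t < 400_000L; t += 3_000L) {
            monitor.ping(attempt, t);
        }
        return monitor.isTimedOut(attempt, 400_000L);
    }

    public static void main(String[] args) {
        System.out.println("ping ignored: timedOut=" + simulateLongClose(false));
        System.out.println("ping honored: timedOut=" + simulateLongClose(true));
    }
}
```

With the flag off, the attempt is reported as timed out even though it pinged throughout. With the flag on, the same attempt stays alive. A task that stops pinging altogether would still expire in both modes, which is the property a disabled timeout (0) gives up.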