[jira] [Created] (MAPREDUCE-7369) MapReduce tasks timing out when spends more time on MultipleOutputs#close

2021-11-18 Thread Prabhu Joseph (Jira)
Prabhu Joseph created MAPREDUCE-7369:


 Summary: MapReduce tasks timing out when spends more time on 
MultipleOutputs#close
 Key: MAPREDUCE-7369
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7369
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.3.1
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


MapReduce tasks time out when they spend a long time in MultipleOutputs#close. 
MultipleOutputs#close takes longer when there are many files to close and 
closing each stream has high latency.

{code}
2021-11-01 02:45:08,312 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report 
from attempt_1634949471086_61268_m_001115_0: 
AttemptID:attempt_1634949471086_61268_m_001115_0 Timed out after 300 secs
{code}

The MapReduce task timeout can be increased, but it is hard to choose the right 
value. The timeout can be disabled by setting it to 0, but then hung tasks 
might never be killed.

The tasks send a ping every 3 seconds, but the ApplicationMaster does not count 
these pings toward liveness. It expects status updates, which are not sent during 
MultipleOutputs#close. This jira is to add a config that makes the 
ApplicationMaster treat a ping from the task as part of its Task Liveliness Check.
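
The proposed behavior can be illustrated with a minimal sketch. This is not the ApplicationMaster's actual code: the class `LivenessSketch`, its methods, and the `pingCountsAsProgress` flag are hypothetical names chosen for illustration, and the real config key is not named in this message. The sketch only shows the idea that a ping refreshes the liveness timestamp when the switch is on, and that a timeout of 0 disables the check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical liveness tracker; names are illustrative, not Hadoop APIs. */
public class LivenessSketch {
    private final long timeoutMs;
    // Hypothetical switch standing in for the config this jira proposes.
    private final boolean pingCountsAsProgress;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    public LivenessSketch(long timeoutMs, boolean pingCountsAsProgress) {
        this.timeoutMs = timeoutMs;
        this.pingCountsAsProgress = pingCountsAsProgress;
    }

    /** Start tracking an attempt at the given time. */
    public void register(String attemptId, long nowMs) {
        lastSeen.put(attemptId, nowMs);
    }

    /** A full status update always refreshes the liveness timestamp. */
    public void statusUpdate(String attemptId, long nowMs) {
        lastSeen.put(attemptId, nowMs);
    }

    /** A ping refreshes the timestamp only when the switch is enabled. */
    public void ping(String attemptId, long nowMs) {
        if (pingCountsAsProgress) {
            lastSeen.put(attemptId, nowMs);
        }
    }

    /** A timeout of 0 disables the check, so nothing ever times out. */
    public boolean isTimedOut(String attemptId, long nowMs) {
        Long last = lastSeen.get(attemptId);
        return last != null && timeoutMs > 0 && nowMs - last > timeoutMs;
    }

    public static void main(String[] args) {
        LivenessSketch monitor = new LivenessSketch(300_000L, true);
        monitor.register("attempt_0", 0L);
        // The task is busy closing streams and only pings every 3 seconds.
        for (long t = 3_000L; t <= 400_000L; t += 3_000L) {
            monitor.ping("attempt_0", t);
        }
        System.out.println(monitor.isTimedOut("attempt_0", 400_000L));
    }
}
```

With the switch off, the same sequence of pings leaves the timestamp at registration time, so the attempt is reported as timed out once 300 seconds have passed, which matches the log line above.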








--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7169) Speculative attempts should not run on the same node

2021-11-18 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445997#comment-17445997
 ] 

Ahmed Hussein commented on MAPREDUCE-7169:
--

bq. Why is denying racks and hosts should be enabled separately? Can you please 
elaborate? Currently we try to avoid launching on same rack as old attempt if 
there are no containers on diff rack then we try choosing node other than old 
attempt node.

Hi [~BilwaST], I second [~jeagles] about separating the rack and the hosts.
Considering the block placement policy on Hadoop, launching the speculative 
attempt on a different rack every time will be a double-edged sword.
Leaving it configurable will give much more flexibility to work along 
with block placements. Otherwise, the speculative changes will be "all or 
nothing": always run on a different rack, or disable it and speculate on the 
same node.
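
To make the point about separate switches concrete, here is a rough sketch under stated assumptions. It is not taken from any of the attached patches: `SpeculativePlacementSketch`, `Node`, `choose`, and the two boolean flags are hypothetical names. It only shows that host denial and rack denial can be applied as independent filters, so a cluster can avoid the original node without being forced off the original rack.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical placement filter; names are illustrative, not YARN APIs. */
public class SpeculativePlacementSketch {

    /** Minimal stand-in for a candidate container location. */
    public static final class Node {
        public final String host;
        public final String rack;

        public Node(String host, String rack) {
            this.host = host;
            this.rack = rack;
        }
    }

    /**
     * Returns the first candidate that passes the enabled filters.
     * denySameHost and denySameRack are independent, hypothetical switches.
     */
    public static Optional<Node> choose(List<Node> candidates, Node original,
                                        boolean denySameHost, boolean denySameRack) {
        return candidates.stream()
            .filter(n -> !(denySameHost && n.host.equals(original.host)))
            .filter(n -> !(denySameRack && n.rack.equals(original.rack)))
            .findFirst();
    }

    public static void main(String[] args) {
        Node original = new Node("h1", "r1");
        List<Node> candidates = List.of(
            new Node("h1", "r1"), new Node("h2", "r1"), new Node("h3", "r2"));
        // Deny only the host: a different node on the same rack is acceptable.
        System.out.println(choose(candidates, original, true, false).get().host);
        // Deny host and rack: only a node on another rack is acceptable.
        System.out.println(choose(candidates, original, true, true).get().host);
    }
}
```

With only the host filter enabled, the speculative attempt can stay on the original rack, which leaves room to work with block placement as described above.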

> Speculative attempts should not run on the same node
> 
>
> Key: MAPREDUCE-7169
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7169
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: yarn
>Affects Versions: 2.7.2
>Reporter: Lee chen
>Assignee: Bilwa S T
>Priority: Major
> Attachments: MAPREDUCE-7169-001.patch, MAPREDUCE-7169-002.patch, 
> MAPREDUCE-7169-003.patch, MAPREDUCE-7169.004.patch, MAPREDUCE-7169.005.patch, 
> MAPREDUCE-7169.006.patch, MAPREDUCE-7169.007.patch, 
> image-2018-12-03-09-54-07-859.png
>
>
>   I found that in all versions of YARN, speculative execution may place the 
> speculative task on the same node as the original task. From what I have read, 
> it only tries to launch one more task attempt; I haven't seen any place that 
> mentions avoiding the same node. This is unreasonable: if the node has problems 
> that make task execution very slow, placing the speculative task on the same 
> node cannot help the problematic task.
>  In our cluster (version 2.7.2, 2700 nodes), this happens almost 
> every day.
>  !image-2018-12-03-09-54-07-859.png! 


