[ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770272#comment-16770272 ]
Zhaohui Xin edited comment on MAPREDUCE-7169 at 2/17/19 3:08 AM:
-----------------------------------------------------------------

After YARN-6592, we can solve this problem easily using the new resource request class, SchedulingRequest.

For now, it may be better to simply remove the last attempt's node host from the next resource request for the speculative attempt. For example:
{code:java}
// Drop the previous attempt's host from the speculative attempt's locality hints.
// Assumes dataLocalHosts is a mutable collection; remove() returns a boolean,
// so its result is not assigned back.
speculationTaskAttempt.dataLocalHosts.remove(lastAttempt.host);
// allocate resource for this speculation attempt.
// code...{code}
[~bibinchundatt], [~ajisakaa], what do you think about this?


was (Author: uranus):
After YARN-6592, it's easy to solve this problem using the new resource request class, SchedulingRequest.

For now, it may be better to simply remove the last attempt's node host from the next resource request for the speculative attempt. For example:
{code:java}
// Drop the previous attempt's host from the speculative attempt's locality hints.
// Assumes dataLocalHosts is a mutable collection; remove() returns a boolean,
// so its result is not assigned back.
speculationTaskAttempt.dataLocalHosts.remove(lastAttempt.host);
// allocate resource for this speculation attempt.
// code...{code}
[~bibinchundatt], [~ajisakaa], what do you think about this?

> Speculative attempts should not run on the same node
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-7169
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7169
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: yarn
>    Affects Versions: 2.7.2
>            Reporter: Lee chen
>            Assignee: Zhaohui Xin
>            Priority: Major
>         Attachments: image-2018-12-03-09-54-07-859.png
>
>
> I found that in all versions of YARN, speculative execution may place the
> speculative task on the same node as the original task. From what I have
> read, it only tries to launch one more task attempt; I haven't seen anything
> that says the new attempt must not run on the same node. This is
> unreasonable: if a problem on the node is making task execution very slow,
> placing the speculative task on the same node cannot help the problematic task.
> In our cluster (version 2.7.2, 2700 nodes), this happens almost every day.
> !image-2018-12-03-09-54-07-859.png!
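
The host-exclusion idea proposed in the comment above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the actual patch: the class SpeculationHostFilter and the method hostsForSpeculativeAttempt are hypothetical names that do not exist in Hadoop, and the hosts are modelled as plain strings. The sketch returns a filtered copy instead of mutating the attempt's own host collection.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class SpeculationHostFilter {

    private SpeculationHostFilter() {}

    /**
     * Returns the hosts to request for a speculative attempt: the original
     * data-local hosts minus the host that ran the previous attempt.
     * The input collection is left unmodified.
     */
    static Set<String> hostsForSpeculativeAttempt(
            Collection<String> dataLocalHosts, String lastAttemptHost) {
        // Copy first so the caller's collection is never mutated.
        Set<String> hosts = new LinkedHashSet<>(dataLocalHosts);
        if (lastAttemptHost != null) {
            hosts.remove(lastAttemptHost);
        }
        return hosts;
    }

    public static void main(String[] args) {
        Set<String> hosts = hostsForSpeculativeAttempt(
                Arrays.asList("node1", "node2", "node3"), "node2");
        System.out.println(hosts); // prints [node1, node3]
    }
}
```

One caveat, stated as an assumption and not verified against the scheduler code: removing a host from the locality hints only drops a preference. If the request may still fall back to rack-local or any-node placement, the speculative attempt could presumably land on the same node anyway, which is likely why the comment points to SchedulingRequest from YARN-6592 as the cleaner solution.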