[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939976#comment-14939976
 ] 

Jason Lowe commented on MAPREDUCE-6302:
---------------------------------------

I think it's reasonable.  There were a number of separate bugs in this area 
because it was complicated; it would be nice to see it simplified and made 
easier to understand.

Do we really want to avoid any kind of preemption if there's a map running?  
I'm thinking of a case where a node failure causes 20 maps to line up for 
scheduling due to fetch failures and we only have one map running.  Do we 
really want to feed those 20 maps through that one map hole?  Hope they don't 
run very long.  ;-)  I haven't studied what the original code did in this 
case, but I noticed it did not early-out if maps were running, hence the 
question.  I think the preemption logic could benefit from knowing whether 
reducers have reported that they're past the SHUFFLE phase, so those reducers 
can be exempted from preemption.  It seems we would want to preempt as many 
reducers in the SHUFFLE phase as necessary to run most or all pending maps in 
parallel, if possible, to minimize job latency in most cases.
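To make the idea concrete, here is a minimal sketch (not the actual 
RMContainerAllocator code; the class and method names are hypothetical) of a 
victim-selection rule that frees at most one reducer per pending map and 
exempts any reducer that has reported progress past the SHUFFLE phase:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of shuffle-aware reducer preemption: preempt only
// reducers still in SHUFFLE, and only as many as there are pending maps.
public class ShufflePreemptionSketch {

    enum Phase { SHUFFLE, SORT, REDUCE }

    static class ReduceAttempt {
        final String id;
        final Phase phase;
        ReduceAttempt(String id, Phase phase) { this.id = id; this.phase = phase; }
    }

    // Select up to `pendingMaps` victims; reducers past SHUFFLE are exempt.
    static List<ReduceAttempt> selectPreemptable(List<ReduceAttempt> running,
                                                 int pendingMaps) {
        List<ReduceAttempt> victims = new ArrayList<>();
        for (ReduceAttempt r : running) {
            if (victims.size() >= pendingMaps) break;      // freed enough slots
            if (r.phase == Phase.SHUFFLE) victims.add(r);  // past-shuffle exempt
        }
        return victims;
    }

    public static void main(String[] args) {
        List<ReduceAttempt> running = new ArrayList<>();
        running.add(new ReduceAttempt("r1", Phase.SHUFFLE));
        running.add(new ReduceAttempt("r2", Phase.REDUCE));
        running.add(new ReduceAttempt("r3", Phase.SHUFFLE));
        // 20 pending maps, but only the two shuffling reducers are candidates.
        System.out.println(selectPreemptable(running, 20).size()); // prints 2
    }
}
```

With a rule like this, the 20-maps-after-node-failure case above would free up 
to 20 shuffle-phase reducers at once instead of serializing the maps through a 
single slot, while reducers already sorting or reducing keep their work.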

Other minor comments on the patch:
- docs for mapreduce.job.reducer.unconditional-preempt.delay.sec should be 
clear about how to disable the functionality if desired, since setting it to 
zero does some pretty bad things.
- preemtping s/b preempting


> Incorrect headroom can lead to a deadlock between map and reduce allocations 
> -----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6302
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: mai shurong
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: AM_log_head100000.txt.gz, AM_log_tail100000.txt.gz, 
> log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, 
> mr-6302-prelim.patch, queue_with_max163cores.png, queue_with_max263cores.png, 
> queue_with_max333cores.png
>
>
> I submitted a big job with 500 maps and 350 reduces to a queue (fair 
> scheduler) with a 300-core maximum. Once the job's maps had all been 
> launched, 300 reduces were occupying all 300 cores in the queue. Then a map 
> failed and was retried, waiting for a core, while the 300 reduces were 
> waiting for the failed map to finish, so a deadlock occurred. As a result, 
> the job was blocked, and later jobs in the queue could not run because no 
> cores were available in the queue.
> I think there is a similar issue for the memory limit of a queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
