[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067193#comment-13067193
 ] 

Robert Joseph Evans commented on MAPREDUCE-2324:
------------------------------------------------

I uploaded a patch a while ago and the conversation has kind of died off.  Can 
someone please review the patch and give me some feedback on it.  If it is 
something that you don't want to put into a sustaining release at this time 
then please give me some feedback possibly with a -1, depending on how adamant 
you are about it, so I can address those issues perhaps by fixing it just in 
0.23 instead.

> Job should fail if a reduce task can't be scheduled anywhere
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-2324
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2, 0.20.205.0
>            Reporter: Todd Lipcon
>            Assignee: Robert Joseph Evans
>         Attachments: MR-2324-security-v1.txt
>
>
> If there's a reduce task that needs more disk space than is available on any 
> mapred.local.dir in the cluster, that task will stay pending forever. For 
> example, we produced this in a QA cluster by accidentally running terasort 
> with one reducer - since no mapred.local.dir had 1T free, the job remained in 
> pending state for several days. The reason for the "stuck" task wasn't clear 
> from a user perspective until we looked at the JT logs.
> Probably better to just fail the job if a reduce task goes through all TTs 
> and finds that there isn't enough space.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to