[ https://issues.apache.org/jira/browse/MAPREDUCE-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans resolved MAPREDUCE-2684.
--------------------------------------------
    Resolution: Duplicate

> Job Tracker can starve reduces with very large input.
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-2684
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2684
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.204.0
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>
> If mapreduce.reduce.input.limit is mis-configured, or if a cluster is simply
> running low on disk space, then reduces with a large input may never get
> scheduled. The job neither fails nor succeeds; it just starves until it is
> killed.
>
> JobInProgress tries to estimate the size of the input to all reducers in a
> job. If the estimate exceeds mapreduce.reduce.input.limit, the job is
> killed. Otherwise, findNewReduceTask() checks whether the estimated size is
> too big to fit on the node currently looking for work. If it is, it lets
> some other task have a chance at the slot.
>
> The idea is to keep track of how often a reduce slot is rejected for lack
> of space versus how often an assignment succeeds, and then guess whether
> the reduce tasks will ever be scheduled.
>
> I would like some feedback on this:
> 1) How should we guess? Someone who found the bug suggested a threshold of
> P1 + (P2 * S), where S is the number of successful assignments, possibly
> with P1 = 20 and P2 = 2.0. I am not really sure.
> 2) What should we do when we guess that a reduce will never get a slot?
> Should we fail the job, or should we schedule it anyway, even though it
> might fail, and see whether it really does?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
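The starvation heuristic discussed in point 1 could be sketched as follows. This is purely illustrative: the class and method names are hypothetical (not from the actual JobInProgress code), and P1 = 20 / P2 = 2.0 are only the suggested starting values from the discussion.

```java
// Hypothetical sketch of the proposed starvation heuristic. A job's reduces
// are considered starved once the number of slot rejections (for lack of
// disk space) exceeds P1 + P2 * S, where S counts successful assignments.
public class ReduceStarvationEstimator {
    // Suggested constants from the issue discussion; not tuned values.
    private static final int P1 = 20;
    private static final double P2 = 2.0;

    private int rejections = 0; // slots rejected because of low disk space
    private int successes = 0;  // reduce tasks successfully assigned

    public void recordRejection() { rejections++; }

    public void recordSuccess() { successes++; }

    /** Guess whether the remaining reduces will ever be scheduled. */
    public boolean looksStarved() {
        return rejections > P1 + P2 * successes;
    }
}
```

With these constants, each successful assignment raises the tolerated rejection count by two, so a job that is still making some progress is unlikely to be flagged, while a job whose reduces are only ever rejected trips the threshold after 21 rejections.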