[ https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110667#comment-16110667 ]
Peter Bacsko commented on MAPREDUCE-6870:
-----------------------------------------

Thanks [~haibochen], I addressed your comments.

Things to check / consider:
1. Variable names (is {{preemptMappersOnReduceFinish}} good?)
2. Added a new method to {{MapTaskImpl}} with locking, which is probably not necessary, but I felt it's better to have it anyway

> Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)
> ----------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6870
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 2.6.1
>            Reporter: Zhe Zhang
>            Assignee: Peter Bacsko
>         Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, MAPREDUCE-6870-003.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get
> scheduled before all reducers are complete, but those mappers run for a long
> time, even after all reducers are complete. This could hurt the performance
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than
> providing intermediate data to reducers. In that case, the job owner should
> have the config option to finish the job once all reducers are complete.
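
For readers following the thread, the requested behaviour can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the attached patches: the class {{JobCompletionSketch}}, its {{Task}} type, and the {{finishWhenReducersDone}} flag are hypothetical names standing in for the proposed config option, and nothing here uses the real Hadoop classes.

```java
import java.util.List;

/**
 * Hypothetical, simplified model of the proposed option.
 * None of these names come from Hadoop or from the patches.
 */
class JobCompletionSketch {

    enum TaskType { MAP, REDUCE }

    /** Minimal stand-in for a task: its type and whether it has finished. */
    static final class Task {
        final TaskType type;
        final boolean finished;

        Task(TaskType type, boolean finished) {
            this.type = type;
            this.finished = finished;
        }
    }

    /**
     * Decides whether the job may be treated as complete.
     *
     * @param tasks                  all tasks of the job
     * @param finishWhenReducersDone assumed boolean job option (hypothetical)
     */
    static boolean isJobDone(List<Task> tasks, boolean finishWhenReducersDone) {
        boolean allDone = true;
        boolean allReducersDone = true;
        int reducerCount = 0;

        for (Task t : tasks) {
            if (t.type == TaskType.REDUCE) {
                reducerCount++;
                if (!t.finished) {
                    allReducersDone = false;
                }
            }
            if (!t.finished) {
                allDone = false;
            }
        }

        if (allDone) {
            return true; // default behaviour: every task has finished
        }
        // With the option on, unfinished mappers no longer block completion,
        // but only if the job has reducers at all (a map-only job must wait).
        return finishWhenReducersDone && reducerCount > 0 && allReducersDone;
    }
}
```

In this sketch a map-only job still waits for its mappers even with the flag set, since there are no reducers whose completion could stand in for the job's. Whether the patch handles that case the same way is not stated in this thread.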