[ https://issues.apache.org/jira/browse/MAPREDUCE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125843#comment-13125843 ]
Harsh J commented on MAPREDUCE-2905:
------------------------------------

Jeff,

I'll leave the final review to people better suited to reviewing FairScheduler patches, but here are some notes on getting this patch into an acceptable state.

A few nits:

- The patch mixes spaces and tabs. Please follow the coding guidelines and use only spaces: 2 spaces per indent, instead of the hard tab characters that seem to be present right now.
- If you'd like to get this included upstream, you'll have to re-upload the patch and grant the ASF permission to include it. You can do this when you attach a file (look for an option at the bottom; perhaps you missed it accidentally).

If possible, can we somehow have a test for this? Just asking.

> CapBasedLoadManager incorrectly allows assignment when assignMultiple is true (was: assignmultiple per job)
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2905
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2905
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>    Affects Versions: 0.20.2
>            Reporter: Jeff Bean
>         Attachments: MR-2905.patch
>
>
> We encountered a situation where in the same cluster, large jobs benefit from
> mapred.fairscheduler.assignmultiple, but small jobs with small numbers of
> mappers do not: the mappers all clump to fully occupy just a few nodes, which
> causes those nodes to saturate and bottleneck. The desired behavior is to
> spread the job across more nodes so that a relatively small job doesn't
> saturate any node in the cluster.
> Testing has shown that setting mapred.fairscheduler.assignmultiple to false
> gives the desired behavior for small jobs, but is unnecessary for large jobs.
> However, since this is a cluster-wide setting, we can't properly tune.
> It'd be nice if jobs can set a param similar to
> mapred.fairscheduler.assignmultiple on submission to better control the task
> distribution of a particular job.
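
To make the clumping described in the issue concrete, here is a toy simulation. It is only a sketch under stated assumptions, not the FairScheduler's actual code: the function `assign_tasks` and all of its parameters are hypothetical, nodes are assumed to heartbeat in a fixed round-robin order, and enabling multiple assignment is modeled as a node filling every free slot on a single heartbeat.

```python
from collections import Counter


def assign_tasks(num_tasks, num_nodes, slots_per_node, assign_multiple):
    """Toy heartbeat loop (hypothetical model, not Hadoop code).

    Returns a list with the number of tasks placed on each node.
    """
    placed = Counter()
    remaining = num_tasks
    while remaining > 0:
        progress = False
        # Assumption: nodes heartbeat in a fixed round-robin order.
        for node in range(num_nodes):
            free = slots_per_node - placed[node]
            if free <= 0 or remaining == 0:
                continue
            # With multiple assignment, one heartbeat fills every free slot;
            # without it, one heartbeat receives at most one task.
            grant = min(free, remaining) if assign_multiple else 1
            placed[node] += grant
            remaining -= grant
            progress = True
        if not progress:
            break  # every slot in the cluster is occupied
    return [placed[n] for n in range(num_nodes)]


# A small job: 4 map tasks on 4 nodes with 2 map slots each.
print(assign_tasks(4, 4, 2, assign_multiple=True))   # [2, 2, 0, 0]
print(assign_tasks(4, 4, 2, assign_multiple=False))  # [1, 1, 1, 1]
```

In this model the small job saturates two nodes when multiple assignment is on and spreads over all four when it is off, which matches the behavior reported above. A job with many more tasks than slots fills the cluster either way, so the setting matters mostly for small jobs.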