[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805652#comment-13805652 ]

Sangjin Lee commented on MAPREDUCE-5186:
----------------------------------------

Thanks for the clarification. Then I'm +1 on restoring the MR1 behavior.

> mapreduce.job.max.split.locations causes some splits created by
> CombineFileInputFormat to fail
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5186
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission
>    Affects Versions: 2.0.4-alpha, 2.2.0
>            Reporter: Sangjin Lee
>            Assignee: Robert Parker
>            Priority: Critical
>         Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch
>
>
> CombineFileInputFormat can easily create splits that can come from many
> different locations (during the last pass of creating "global" splits).
> However, we observe that this often runs afoul of the
> mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and with any
> decent size cluster, CombineFileInputFormat creates splits that are well
> above this limit.

--
This message was sent by Atlassian JIRA
(v6.1#6144)