[ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792828#comment-13792828 ]
Sangjin Lee commented on MAPREDUCE-5186:
----------------------------------------

Raising the priority. The default value of mapreduce.job.max.split.locations effectively renders CombineFileInputFormat DOA on any decent-sized cluster. Have others encountered this issue?

> mapreduce.job.max.split.locations causes some splits created by
> CombineFileInputFormat to fail
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5186
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sangjin Lee
>            Priority: Critical
>
> CombineFileInputFormat can easily create splits that come from many
> different locations (during the last pass of creating "global" splits).
> However, we observe that this often runs afoul of the
> mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and with any
> decent-sized cluster, CombineFileInputFormat creates splits that are well
> above this limit.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
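
For anyone trying to picture the failure mode described in the issue, here is a rough, self-contained sketch. It is not Hadoop's JobSplitWriter code. The class `SplitLocationCheck`, its methods, and the host names are all hypothetical, and the only value taken from the report is the default limit of 10. It simply models a check that rejects any split listing more locations than the configured maximum.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the limit check described in the report; this is
// not Hadoop's JobSplitWriter, and all names here are hypothetical.
class SplitLocationCheck {
    // Default of mapreduce.job.max.split.locations as stated in the report.
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    // Rejects a split whose location list is longer than the configured limit.
    static void checkLocations(List<String> locations, int maxLocations) throws IOException {
        if (locations.size() > maxLocations) {
            throw new IOException("split has " + locations.size()
                + " locations, limit is " + maxLocations);
        }
    }

    // Builds n distinct made-up host names, standing in for the hosts that a
    // "global" combined split might span.
    static List<String> hosts(int n) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            result.add("host" + i + ".example.com");
        }
        return result;
    }

    public static void main(String[] args) {
        int[] sizes = {3, 10, 25};
        for (int size : sizes) {
            try {
                checkLocations(hosts(size), DEFAULT_MAX_SPLIT_LOCATIONS);
                System.out.println(size + " locations: accepted");
            } catch (IOException e) {
                System.out.println(size + " locations: rejected (" + e.getMessage() + ")");
            }
        }
    }
}
```

In this model a split spanning 25 hosts is rejected under the default of 10, while the same split passes once the limit is at least 25, which is the mismatch the report describes between combined splits and the default setting.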