Indeed you cannot explicitly set the number of mappers, but you can still gain some control over it by setting mapred.max.split.size or mapred.min.split.size.
For example, suppose you have a 10 GB file (10737418240 B) and you would like 10 mappers, so each mapper should handle 1 GB of data. Since splitsize = max(minimumSize, min(maximumSize, blockSize)), you can raise the minimum split size above the block size by setting mapred.min.split.size=1073741824 (1 GB), i.e.

$ hadoop jar yourjar -Dmapred.min.split.size=1073741824 yourargs

(the -D generic option is picked up when the job's driver goes through ToolRunner / GenericOptionsParser). It is well explained in this thread: http://stackoverflow.com/questions/9678180/change-file-split-size-in-hadoop

Regards,
Zheyi.
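If you prefer to set it in the driver rather than on the command line, here is a minimal sketch of a Tool-based driver that fixes the minimum split size. The class name, job name, and input/output paths are just placeholders, and the mapper/reducer setup is omitted:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class SplitSizeDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // Raise the minimum split size to 1 GB so that
        // splitsize = max(minimumSize, min(maximumSize, blockSize)) = 1 GB,
        // giving roughly 10 map tasks for a 10 GB input.
        // (Newer releases use mapreduce.input.fileinputformat.split.minsize;
        // the old mapred.* name is still honoured as a deprecated alias.)
        conf.setLong("mapred.min.split.size", 1073741824L);

        Job job = new Job(conf, "split-size-example");
        job.setJarByClass(SplitSizeDriver.class);
        // Mapper/reducer classes omitted; this sketch only shows the split setting.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new SplitSizeDriver(), args));
    }
}

Because the driver extends Configured and runs through ToolRunner, the same property can also be overridden per run with the -D option shown above, without recompiling.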