On May 5, 2009, at 4:47 AM, Christian Ulrik Søttrup wrote:

Hi all,

I have a job that creates very big local files, so I need to split it across as many mappers as possible. With the DFS block size I'm using, this job is only split into 3 mappers. I don't want to change the HDFS-wide block size because it works for my other jobs.


I would rather keep the big files on HDFS and use -Dmapred.min.split.size to get more maps to process your data:

http://hadoop.apache.org/core/docs/r0.20.0/mapred_tutorial.html#Job+Input
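For reference, a minimal sketch of how that property could be passed on the command line (the jar, class, path, and size values here are placeholders, not from this thread), assuming the job's driver runs through ToolRunner/GenericOptionsParser so that -D options are picked up:

    hadoop jar myjob.jar MyJob \
        -Dmapred.min.split.size=33554432 \
        /user/me/input /user/me/output

If the driver does not go through GenericOptionsParser, the same property can be set directly on the JobConf before submitting, e.g. conf.setLong("mapred.min.split.size", 32 * 1024 * 1024).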

Arun
