The number of mappers is determined by your InputFormat.

In the common case, if the file is smaller than one block (64 MB by
default), Hadoop uses one mapper for that file. If the file is larger than
one block, Hadoop splits it, and the number of mappers for the file is
ceiling( (size of file) / (size of block) ).
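As a rough illustration, here is a minimal Java sketch of that ceiling
calculation. It assumes the default FileInputFormat behavior where the
split size equals the HDFS block size; the 64 MB block size and 1 GB
file size are example values, not anything from your cluster.

    public class SplitCountEstimate {
        public static void main(String[] args) {
            long blockSize = 64L * 1024 * 1024;       // assumed 64 MB default block size
            long fileSize  = 1024L * 1024 * 1024;     // hypothetical 1 GB input file

            // ceiling(fileSize / blockSize) -> number of splits, hence mappers
            long numMappers = (fileSize + blockSize - 1) / blockSize;

            System.out.println("Estimated mappers: " + numMappers); // prints 16
        }
    }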

Hi

Do you mean I should set the number of map tasks to 1?
I want to process this file not on a single node but across the entire cluster. I need a lot of processing power in order to finish the job in hours instead of days.
