Actually, you do not need to set the number of map tasks; the InputFormat will compute it for you according to your input data set.
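To make the rule quoted below concrete, here is a minimal arithmetic sketch (plain Java, not Hadoop's actual source) of the ceiling computation; the 200 MB file size is a made-up example:

    // Sketch of the rule quoted below:
    // mappers for one file = ceiling(file size / block size)
    long blockSize = 64L * 1024 * 1024;   // 64 MB, the default block size
    long fileSize  = 200L * 1024 * 1024;  // a hypothetical 200 MB input file

    // Integer ceiling division: (a + b - 1) / b
    long numMappers = (fileSize + blockSize - 1) / blockSize;
    System.out.println(numMappers);       // prints 4: three full 64 MB splits
                                          // plus one 8 MB remainder split

Each of those splits becomes its own map task, and the framework schedules them on whatever nodes are available, so a single large file is spread over the cluster automatically.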
Jeff Zhang

On Thu, Nov 26, 2009 at 7:39 AM, CubicDesign <cubicdes...@gmail.com> wrote:

>> The number of mappers is determined by your InputFormat.
>>
>> In the common case, if a file is smaller than one block (which is 64 MB
>> by default), there is one mapper for that file. If a file is larger than
>> one block, Hadoop will split it, and the number of mappers for that file
>> will be ceiling((size of file) / (size of block)).
>
> Hi
>
> Do you mean I should set the number of map tasks to 1?
> I want to process this file not on a single node but over the entire
> cluster. I need a lot of processing power in order to finish the job in
> hours instead of days.
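For reference, a hedged sketch of what a 2009-era job setup looks like with the old org.apache.hadoop.mapred API; the point is that no map-task count is set anywhere. The class name LargeFileJob and the argument handling are placeholders, not from the original thread:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class LargeFileJob {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(LargeFileJob.class);
            conf.setJobName("large-file-job");

            // Point the job at the large input file. The InputFormat splits
            // it into block-sized chunks and one map task is launched per
            // split across the cluster; there is no map count to configure.
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }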