Actually, you do not need to set the number of map tasks; the InputFormat will compute it for you according to your input data set.
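To make the rule quoted below concrete, here is a minimal arithmetic sketch (plain Java, not Hadoop's actual source) of the ceiling computation; the 200 MB file size is a made-up example:

    // Sketch of the rule quoted below:
    // mappers for one file = ceiling(file size / block size)
    long blockSize = 64L * 1024 * 1024;   // 64 MB, the default block size
    long fileSize  = 200L * 1024 * 1024;  // a hypothetical 200 MB input file

    // Integer ceiling division: (a + b - 1) / b
    long numMappers = (fileSize + blockSize - 1) / blockSize;
    System.out.println(numMappers);       // prints 4: three full 64 MB splits
                                          // plus one 8 MB remainder split

Each of those splits becomes its own map task, and the framework schedules them on whatever nodes are available, so a single large file is spread over the cluster automatically.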
Jeff Zhang

On Thu, Nov 26, 2009 at 7:39 AM, CubicDesign <cubicdes...@gmail.com> wrote:

>> The number of mappers is determined by your InputFormat.
>>
>> In the common case, if a file is smaller than one block (which is 64 MB
>> by default), there is one mapper for that file. If a file is larger than
>> one block, Hadoop will split it, and the number of mappers for that file
>> will be ceiling((size of file) / (size of block)).
>
> Hi
>
> Do you mean I should set the number of map tasks to 1?
> I want to process this file not on a single node but over the entire
> cluster. I need a lot of processing power in order to finish the job in
> hours instead of days.
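For reference, a hedged sketch of what a 2009-era job setup looks like with the old org.apache.hadoop.mapred API; the point is that no map-task count is set anywhere. The class name LargeFileJob and the argument handling are placeholders, not from the original thread:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class LargeFileJob {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(LargeFileJob.class);
            conf.setJobName("large-file-job");

            // Point the job at the large input file. The InputFormat splits
            // it into block-sized chunks and one map task is launched per
            // split across the cluster; there is no map count to configure.
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }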