Is your data size 100-200MB *total*? If so, this is the expected behavior for MultiFileInputFormat. As Bejoy says, you can switch to TextInputFormat to get one mapper per block (with a minimum of one mapper per file).
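For reference, here's a minimal sketch of a mapper-only job on the old mapred API (the one 0.19.x uses) with TextInputFormat set explicitly. The class name, paths, and the map logic are placeholders; in your job the mapper would emit Text keys and your CustomWritable values instead.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

public class MapperOnlyJob {

  // Placeholder mapper: emits each input line as-is. In the real job this
  // would produce Text keys and CustomWritable values.
  public static class LineMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
      output.collect(line, line);
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MapperOnlyJob.class);
    conf.setJobName("mapper-only");

    // TextInputFormat splits by HDFS block: one mapper per block,
    // and at least one mapper per file.
    conf.setInputFormat(TextInputFormat.class);

    conf.setMapperClass(LineMapper.class);
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);

    // Mapper-only job: no reduce phase, map output goes straight to HDFS.
    conf.setNumReduceTasks(0);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}
```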
-Joey

On Thu, Feb 16, 2012 at 11:03 AM, Thamizhannal Paramasivam <thamizhanna...@gmail.com> wrote:
> Here is the input format for the mapper:
> Input Format: MultiFileInputFormat
> MapperOutputKey: Text
> MapperOutputValue: CustomWritable
>
> I am not in a position to upgrade from hadoop-0.19.2, for various reasons.
>
> I checked the number of mappers on the JobTracker.
>
> Thanks,
> Thamizh
>
> On Thu, Feb 16, 2012 at 6:56 PM, Joey Echeverria <j...@cloudera.com> wrote:
>> Hi Tamil,
>>
>> I'd recommend upgrading to a newer release, as 0.19.2 is very old. As for
>> your question, most input formats should set the number of mappers
>> correctly. What input format are you using? Where did you see the number
>> of tasks assigned to the job?
>>
>> -Joey
>>
>> On Thu, Feb 16, 2012 at 1:40 AM, Thamizhannal Paramasivam <thamizhanna...@gmail.com> wrote:
>>> Hi All,
>>> I am using hadoop-0.19.2 and running a mapper-only job on a cluster. Its
>>> input path has >1000 files of 100-200MB each. Since it is a mapper-only
>>> job, I set the number of reducers to 0. It is using only 2 mappers to
>>> process all the input files. If we do not set the number of mappers,
>>> shouldn't it pick one mapper per input file? Or shouldn't the default
>>> pick a reasonable number of mappers based on the number of input files?
>>> Thanks,
>>> tamil
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434