If your datanode holds 2 HDFS blocks of the input file, the scheduler will prefer to run 2 map tasks on the tasktracker co-located with that datanode (data locality).
On Fri, Jul 1, 2011 at 10:33 PM, Juwei Shi <shiju...@gmail.com> wrote:
> I think that Anthony is right. Task capacity has to be set in
> mapred-site.xml, and the cluster restarted.
>
> Anthony Urso
>
> 2011/7/2 <praveen.pe...@nokia.com>
>> Are you sure? AFAIK all mapred.xxx properties can be set via the job
>> config. I also read in the Yahoo tutorial that this property can be set
>> either in hadoop-site.xml or in the job config. Maybe someone who has
>> really used this property can confirm.
>>
>> Praveen
>>
>> On Jul 1, 2011, at 4:46 PM, "ext Anthony Urso" <antho...@cs.ucla.edu> wrote:
>>
>>> On Fri, Jul 1, 2011 at 1:03 PM, <praveen.pe...@nokia.com> wrote:
>>>> Hi all,
>>>>
>>>> I am using Hadoop 0.20.2. I am setting the property
>>>> mapred.tasktracker.map.tasks.maximum = 4 (same for reduce) in my job
>>>> conf, but I am still seeing a max of only 2 map and 2 reduce tasks on
>>>> each node. I know my machine can run 4 map and 4 reduce tasks in
>>>> parallel. Is this a bug in 0.20.2, or am I doing something wrong?
>>>
>>> If I remember correctly, you have to set this in your hadoop-site.xml
>>> and restart your job tracker and task trackers.
>>>
>>>> Configuration conf = new Configuration();
>>>>
>>>> conf.set("mapred.tasktracker.map.tasks.maximum", "4");
>>>>
>>>> conf.set("mapred.tasktracker.reduce.tasks.maximum", "4");
>>>>
>>>> Thanks
>>>> Praveen
>
> --
> - Juwei

--
Best Regards,
Mostafa Ead
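For anyone hitting the same issue: the two properties discussed above are daemon-side settings in 0.20.2, read once when the tasktracker starts, which is why setting them on the job's Configuration has no effect. A minimal sketch of the cluster-side config (property names are from the thread; the exact file, hadoop-site.xml vs. a split mapred-site.xml, depends on your layout):

```xml
<!-- hadoop-site.xml (or conf/mapred-site.xml) on each tasktracker node. -->
<!-- Caps the number of map/reduce tasks a single tasktracker runs in
     parallel; the default of 2 matches the behavior Praveen observed. -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

After editing the file you have to restart the tasktracker daemons (and job tracker) for the new limits to take effect, as Anthony noted.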