You will need to use an alternative scheduler for this. Look at the
minMaps/maxMaps/etc. properties of the FairScheduler allocation file:
http://hadoop.apache.org/docs/stable/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29

Alternatively, look at resource-based scheduling in the CapacityScheduler:
http://hadoop.apache.org/docs/stable/capacity_scheduler.html#Resource+based+scheduling
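
For the FairScheduler route, a minimal sketch of an allocation file that caps a pool at 10 concurrent map tasks might look like the fragment below. The pool name "limited" is only a placeholder I made up for illustration, and you should check the fair scheduler documentation for your exact release to confirm that maxMaps is supported there.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: allocation file read by the FairScheduler -->
<allocations>
  <!-- "limited" is a hypothetical pool name; pick your own -->
  <pool name="limited">
    <!-- upper bound on map tasks running concurrently in this pool -->
    <maxMaps>10</maxMaps>
  </pool>
</allocations>
```

This only takes effect once the JobTracker is actually running the FairScheduler (mapred.jobtracker.taskScheduler set to org.apache.hadoop.mapred.FairScheduler in mapred-site.xml) and the job is submitted to that pool, for example by setting mapred.fairscheduler.pool in the job configuration. Treat these property names as something to verify against the documentation for your version rather than as a tested recipe.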

P.S. Do not use the general@ list for user-level queries. The right list is
user@hadoop.apache.org.

On Fri, Jan 18, 2013 at 3:52 PM, hwang <joe.haiw...@gmail.com> wrote:
> Hi all:
>
> My hadoop version is 1.0.2. Now I want at most 10 map tasks running at the
> same time. I have found 2 parameter related to this question.
>
> a) mapred.job.map.capacity
>
> but in my hadoop version, this parameter seems abandoned.
>
> b) mapred.jobtracker.taskScheduler.maxRunningTasksPerJob (
> http://grepcode.com/file/repo1.maven.org/maven2/com.ning/metrics.collector/1.0.2/mapred-default.xml
> )
>
> I set this variable like below:
>
> Configuration conf = new Configuration();
> conf.set("date", date);
> conf.set("mapred.job.queue.name", "hadoop");
> conf.set("mapred.jobtracker.taskScheduler.maxRunningTasksPerJob", "10");
>
> DistributedCache.createSymlink(conf);
> Job job = new Job(conf, "ConstructApkDownload_" + date);
> ...
>
> The problem is that it doesn't work. There is still more than 50 maps
> running as the job starts.
>
> I'm not sure whether I set this parameter in wrong way ? or misunderstand
> it.
>
> After looking through the hadoop document, I can't find another parameter
> to limit the concurrent running map tasks.
>
> Hope someone can help me ,Thanks.
>

--
Harsh J