My program is a basic program like this:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        Job.Job.setNumReduceTasks(10);   // <-- this line does not compile
        JobClient.runJob(conf);
    }
}

How can I use the Job.setNumReduceTasks(int) method here? I am not using any Job class object in this code.

Thanks,
Praveenesh

On Fri, May 20, 2011 at 7:07 PM, Evert Lammerts <evert.lamme...@sara.nl> wrote:

> Hi Praveenesh,
>
> * You can set the maximum number of reducers per node in your
> mapred-site.xml using mapred.tasktracker.reduce.tasks.maximum (default
> set to 2).
> * You can set the default number of reduce tasks with mapred.reduce.tasks
> (default set to 1 - this causes your single reducer).
> * Your job can override this setting by calling
> Job.setNumReduceTasks(int) (
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#setNumReduceTasks(int)
> ).
>
> Cheers,
> Evert
>
> > -----Original Message-----
> > From: modemide [mailto:modem...@gmail.com]
> > Sent: Friday, May 20, 2011 15:26
> > To: common-user@hadoop.apache.org
> > Subject: Re: Why Only 1 Reducer is running ??
> >
> > What does your mapred-site.xml file say?
> >
> > I've used wordcount and had close to 12 reduces running on a
> > 6-datanode cluster on a 3 GB file.
> >
> > I have a configuration in there which says:
> > mapred.reduce.tasks = 12
> >
> > The reason I chose 12 was because it was recommended that I choose 2x
> > the number of tasktrackers.
> >
> > On 5/20/11, praveenesh kumar <praveen...@gmail.com> wrote:
> > > Hello everyone,
> > >
> > > I am using the wordcount application to test my hadoop cluster of
> > > 5 nodes. The file size is around 5 GB.
> > > It takes around 2 min 40 sec to execute, but when I check the
> > > JobTracker web portal, I see only one reducer running. Why is that?
> > > How can I change the code so that multiple reducers run as well?
> > >
> > > Thanks,
> > > Praveenesh
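[Editor's note] To tie the thread together: the program above uses the old org.apache.hadoop.mapred API, where there is no Job object at all; the reduce-task count is set directly on the JobConf. A minimal sketch of the relevant part of the driver (the mapper/reducer/format setup is elided; class names are from the 0.20-era mapred API):

```java
// Sketch: setting the number of reduce tasks with the old (mapred) API.
// JobConf itself has the setter, so no Job object is needed.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.*;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        // ... setMapperClass / setReducerClass / input-output formats
        //     as in the program above ...
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Replaces the broken "Job.Job.setNumReduceTasks(10)" line:
        conf.setNumReduceTasks(10);

        JobClient.runJob(conf);
    }
}
```

Job.setNumReduceTasks(int), as linked by Evert, belongs to the newer org.apache.hadoop.mapreduce API; to call it you would port the whole program to that API and invoke it on a Job instance. Note also that a job-level setting can only take effect within the cluster limits discussed above (mapred.tasktracker.reduce.tasks.maximum per node).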