Mullangi created HADOOP-9764:
--------------------------------

             Summary: MapReduce output issue
                 Key: HADOOP-9764
                 URL: https://issues.apache.org/jira/browse/HADOOP-9764
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.0.3
         Environment: ubuntu
            Reporter: Mullangi

Hi,

I am new to Hadoop. While practicing with a custom MapReduce program, I found that the result is not as expected when the job reads its input file from HDFS. When I run the same program on a file in the local Unix file system, I get the expected result.

Below are the details of my code.

MapReduce in java
==================

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.*;

public class WordCount1 {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            String tokenedZone = null;
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                tokenedZone = tokenizer.nextToken();
                word.set(tokenedZone);
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            int val = 0;
            while (values.hasNext()) {
                val = values.next().get();
                sum += val;
            }
            if (sum > 1)
                output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        conf.setJarByClass(WordCount1.class);
        conf.setJobName("wordcount1");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        Path inPath = new Path(args[0]);
        Path outPath = new Path(args[0]);

        FileInputFormat.setInputPaths(conf, inPath);
        FileOutputFormat.setOutputPath(conf, outPath);

        JobClient.runJob(conf);
    }
}

input File
===========
test my program during test and my hadoop your during get program hadoop

generated output file on HDFS file system
=======================================
during 2
my 2
test 2

hadoop generated output file on local file system
=======================================
during 2
my 2
program 2
test 2

Please help me with this issue.
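
One possible explanation, offered as an assumption and not a confirmed diagnosis: the job registers Reduce as both combiner and reducer, and Reduce drops every key whose sum is not greater than 1. Hadoop may run a combiner zero, one, or several times on partial map output, so a word whose occurrences are spread across different combiner invocations can be discarded before the reducer ever sees its full count. The sketch below simulates that effect in plain Java without any Hadoop classes. The class CombinerFilterDemo, its methods, and the way the sample input is cut into two chunks are all hypothetical and chosen only for illustration; it does not claim to reproduce the exact outputs listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone simulation; no Hadoop classes are involved.
class CombinerFilterDemo {

    // Counts the tokens in one chunk of input. This is equivalent to summing
    // the (word, 1) pairs a word-count mapper emits for that chunk.
    static Map<String, Integer> countTokens(String chunk) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : chunk.trim().split("\\s+")) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Mirrors the reported Reduce logic: sum the counts per word,
    // then keep only the words whose sum is greater than 1.
    static Map<String, Integer> sumAndFilter(List<Map<String, Integer>> partials) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> sums.merge(word, count, Integer::sum));
        }
        sums.values().removeIf(sum -> sum <= 1);
        return sums;
    }

    // Runs the simulated job. When filterInCombiner is true, the filtering
    // logic is applied once per chunk before the final reduce step.
    static Map<String, Integer> run(List<String> chunks, boolean filterInCombiner) {
        List<Map<String, Integer>> reducerInput = new ArrayList<>();
        for (String chunk : chunks) {
            Map<String, Integer> mapOutput = countTokens(chunk);
            reducerInput.add(filterInCombiner ? sumAndFilter(List.of(mapOutput)) : mapOutput);
        }
        return sumAndFilter(reducerInput);
    }

    public static void main(String[] args) {
        // Assumed split of the sample input into two chunks (illustrative only).
        List<String> chunks = List.of(
                "test my program during test and my",
                "hadoop your during get program hadoop");

        // "during" and "program" occur once per chunk, so the combiner drops them.
        System.out.println(run(chunks, true));   // {hadoop=2, my=2, test=2}

        // Without the filtering combiner the reducer sees the full counts.
        System.out.println(run(chunks, false));  // {during=2, hadoop=2, my=2, program=2, test=2}
    }
}
```

If this is indeed the cause, removing the conf.setCombinerClass(Reduce.class) call, or using a separate combiner that only sums and never filters, should make the result independent of how the input is split.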