(2014/02/17 10:06), Akira AJISAKA wrote:
> The following map outputs are correct:
> <001,sales    35.99>
> <002,sales    12.49>
> <004,sales    13.42>
> <003,sales    499.99>
> <001,sales    78.95>
> <002,sales    21.99>
> <002,sales    93.45>
> <001,sales    9.99>
> <001,accounts John Allen>
> <002,accounts Abigail Smith>
> <003,accounts April Stevens>
> <004,accounts Nasser Hafez>
> The outputs are grouped and sorted by key, and the reducers process each
> group. The inputs to the reduce method are as follows:
> <key: 001,
>   values: {sales 35.99, sales 78.95, sales 9.99, accounts John Allen}>
> <key: 002,
>   values: {sales 12.49, sales 21.99, sales 93.45, accounts Abigail Smith}>
> <key: 003,
>   values: {sales 499.99, accounts April Stevens}>
> <key: 004,
>   values: {sales 13.42, accounts Nasser Hafez}>
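
The grouping described above can be reproduced in plain Java without a cluster. The following is a minimal sketch, not Hadoop code: the class name ReduceJoinWalkthrough and the helpers shuffle and reduce are hypothetical, a TreeMap stands in for the framework's group-and-sort step, and the reduce body copies the logic of ReduceJoinReducer from the original post. It assumes the values contain a tab between the tag and the payload, as the mappers emit them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the shuffle and reduce steps; no Hadoop classes are used.
class ReduceJoinWalkthrough {

    // Map outputs as listed above: {key, value}, with a tab inside each value.
    static final String[][] MAP_OUTPUTS = {
        {"001", "sales\t35.99"}, {"002", "sales\t12.49"},
        {"004", "sales\t13.42"}, {"003", "sales\t499.99"},
        {"001", "sales\t78.95"}, {"002", "sales\t21.99"},
        {"002", "sales\t93.45"}, {"001", "sales\t9.99"},
        {"001", "accounts\tJohn Allen"}, {"002", "accounts\tAbigail Smith"},
        {"003", "accounts\tApril Stevens"}, {"004", "accounts\tNasser Hafez"},
    };

    // Stand-in for the shuffle: group values by key; TreeMap keeps keys sorted.
    static Map<String, List<String>> shuffle(String[][] pairs) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] kv : pairs) {
            groups.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        return groups;
    }

    // Same logic as ReduceJoinReducer.reduce, returning the output line as a string.
    static String reduce(String key, Iterable<String> values) {
        String name = "";
        double total = 0.0;
        int count = 0;
        for (String v : values) {
            String[] parts = v.split("\t");
            if (parts[0].equals("sales")) {
                count++;                              // one more sale for this account
                total += Float.parseFloat(parts[1]);  // float parsing, as in the original
            } else if (parts[0].equals("accounts")) {
                name = parts[1];                      // the single account record
            }
        }
        return String.format(Locale.ROOT, "%s\t%d\t%f", name, count, total);
    }

    public static void main(String[] args) {
        // One reduce call per key group, in key order.
        for (Map.Entry<String, List<String>> e : shuffle(MAP_OUTPUTS).entrySet()) {
            System.out.println(reduce(e.getKey(), e.getValue()));
        }
    }
}
```

Each call to reduce sees one key and all of its values, so key 001 yields the name John Allen, a count of 3, and the sum of its three sales. The order of values within a group is not guaranteed in a real job, which is why the reducer checks the tag of every value instead of relying on position.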
> Regards,
> Akira
> (2014/02/17 1:14), EdwardKing wrote:
>> Hello everyone,
>>      I am a newbie to Hadoop 2.2.0 and I am puzzled by the reduce method. I have two
>> text files, sales.txt and account.txt, as follows:
>> sales.txt
>> 001 35.99 2012-03-15
>> 002 12.49 2004-07-02
>> 004 13.42 2005-12-20
>> 003 499.99 2010-12-20
>> 001 78.95 2012-04-02
>> 002 21.99 2006-11-30
>> 002 93.45 2008-09-10
>> 001 9.99 2012-05-17
>> account.txt
>> 001 John Allen Standard 2012-03-15
>> 002 Abigail Smith Premium 2004-07-13
>> 003 April Stevens Standard 2010-12-20
>> 004 Nasser Hafez Premium 2001-04-23
>> The code is as follows:
>> import java.io.IOException;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.Path;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.mapreduce.Job;
>> import org.apache.hadoop.mapreduce.Mapper;
>> import org.apache.hadoop.mapreduce.Reducer;
>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>> import org.apache.hadoop.mapreduce.lib.input.MultipleInputs ;
>> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat ;
>> public class ReduceJoin
>> {
>>       public static class SalesRecordMapper
>>       extends Mapper<Object, Text, Text, Text>{
>>           public void map(Object key, Text value, Context context
>>           ) throws IOException, InterruptedException
>>           {
>>               String record = value.toString() ;
>>               String[] parts = record.split("\t") ;
>>               context.write(new Text(parts[0]), new Text("sales\t"+parts[1])) ;
>>           }
>>       }
>>       public static class AccountRecordMapper
>>       extends Mapper<Object, Text, Text, Text>{
>>           public void map(Object key, Text value, Context context
>>           ) throws IOException, InterruptedException
>>           {
>>               String record = value.toString() ;
>>               String[] parts = record.split("\t") ;
>>               context.write(new Text(parts[0]), new Text("accounts\t"+parts[1])) ;
>>           }
>>       }
>>       public static class ReduceJoinReducer
>>       extends Reducer<Text, Text, Text, Text>
>>       {
>>           public void reduce(Text key, Iterable<Text> values,
>>               Context context
>>           ) throws IOException, InterruptedException
>>           {
>>               String name = "" ;
>>               double total = 0.0 ;
>>               int count = 0 ;
>>               for(Text t: values)
>>               {
>>                   String parts[] = t.toString().split("\t") ;
>>                   if (parts[0].equals("sales"))
>>                   {
>>                       count++ ;
>>                       total+= Float.parseFloat(parts[1]) ;
>>                   }
>>                   else if (parts[0].equals("accounts"))
>>                   {
>>                       name = parts[1] ;
>>                   }
>>               }
>>               String str = String.format("%d\t%f", count, total) ;
>>               context.write(new Text(name), new Text(str)) ;
>>           }
>>       }
>>       public static void main(String[] args) throws Exception {
>>           Configuration conf = new Configuration();
>>           Job job = Job.getInstance(conf, "Reduce-side join");
>>           job.setJarByClass(ReduceJoin.class);
>>           job.setReducerClass(ReduceJoinReducer.class);
>>           job.setOutputKeyClass(Text.class);
>>           job.setOutputValueClass(Text.class);
>>           MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, SalesRecordMapper.class) ;
>>           MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, AccountRecordMapper.class) ;
>>           Path outputPath = new Path(args[2]);
>>           FileOutputFormat.setOutputPath(job, outputPath);
>>           // Remove any previous output directory so the job can be rerun.
>>           outputPath.getFileSystem(conf).delete(outputPath, true);
>>           System.exit(job.waitForCompletion(true) ? 0 : 1);
>>       }
>> }
>> I created join.jar and ran it:
>> $ hadoop jar join.jar ReduceJoin sales accounts outputs
>> $ hadoop fs -cat /user/garry/outputs/part-r-00000
>> John Allen 3 124.929998
>> Abigail Smith 3 127.929996
>> April Stevens 1 499.989990
>> Nasser Hafez 1 13.420000
>> But I don't understand the reduce method. How does it produce the following
>> result? Could anyone give the detailed steps that produce it? Thanks in
>> advance.
>> John Allen 3 124.929998
>> Abigail Smith 3 127.929996
>> April Stevens 1 499.989990
>> Nasser Hafez 1 13.420000
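
One detail of the totals above is worth a worked example: John Allen's sales add up to 124.93, yet the job prints 124.929998. The reducer parses each amount with Float.parseFloat and adds it to a double, so the single-precision rounding of every amount is carried into the sum. The sketch below is plain Java with hypothetical names (FloatVsDoubleTotal, sumAsFloat, sumAsDouble) and is only meant to illustrate that effect; it is not part of the original job.

```java
import java.util.Locale;

// Compares float-based and double-based parsing when summing sale amounts.
class FloatVsDoubleTotal {

    // Mirrors the reducer: each amount is parsed as a float, then widened to double.
    static String sumAsFloat(String... amounts) {
        double total = 0.0;
        for (String a : amounts) {
            total += Float.parseFloat(a);
        }
        return String.format(Locale.ROOT, "%f", total);
    }

    // Alternative: parse as double so no single-precision rounding occurs.
    static String sumAsDouble(String... amounts) {
        double total = 0.0;
        for (String a : amounts) {
            total += Double.parseDouble(a);
        }
        return String.format(Locale.ROOT, "%f", total);
    }

    public static void main(String[] args) {
        System.out.println(sumAsFloat("35.99", "78.95", "9.99"));   // 124.929998
        System.out.println(sumAsDouble("35.99", "78.95", "9.99"));  // 124.930000
    }
}
```

If exact currency totals matter, parsing with Double.parseDouble, or using java.math.BigDecimal, would be a reasonable change to the reducer.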
