Re: question about reduce method

Akira AJISAKA Mon, 17 Feb 2014 10:15:00 -0800

Moving to u...@hadoop.apache.org.

If you have a question about this, please reply to the
user mailing list instead of mapreduce-dev@.


Thanks,
Akira

(2014/02/17 10:06), Akira AJISAKA wrote:
>> I understand that the map method turns these text files into key/value
>> pairs like the following, right?
>> <001, 35.99>
>> <002, 12.49>
>> <004, 13.42>
>> <003, 499.99>
>> <001, 78.95>
>> <002, 21.99>
>> <002, 93.45>
>> <001, 9.99>
>> <001, John Allen>
>> <002, Abigail Smith>
>> <003, April Stevens>
>> <004, Nasser Hafez>
> 
> Not quite. Each mapper tags its value with the source ("sales" or
> "accounts"), so the correct map outputs are the following:
> 
> <001,sales    35.99>
> <002,sales    12.49>
> <004,sales    13.42>
> <003,sales    499.99>
> <001,sales    78.95>
> <002,sales    21.99>
> <002,sales    93.45>
> <001,sales    9.99>
> <001,accounts John Allen>
> <002,accounts Abigail Smith>
> <003,accounts April Stevens>
> <004,accounts Nasser Hafez>
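
To make the tagging step concrete, here is a minimal sketch in plain Java (no Hadoop dependency). The class TagSketch and the helper tag() are hypothetical names used only for illustration, and the sketch assumes the input fields are separated by tabs, since both mappers split on "\t".

```java
public class TagSketch {

    // Hypothetical helper mirroring both mappers in the thread:
    // key = first field, value = source tag + tab + second field.
    public static String[] tag(String source, String line) {
        String[] parts = line.split("\t");
        return new String[] {parts[0], source + "\t" + parts[1]};
    }

    public static void main(String[] args) {
        // Assumption: the real files are tab-separated.
        String[] sale = tag("sales", "001\t35.99\t2012-03-15");
        String[] account = tag("accounts", "001\tJohn Allen\tStandard\t2012-03-15");
        System.out.println("<" + sale[0] + "," + sale[1] + ">");
        System.out.println("<" + account[0] + "," + account[1] + ">");
    }
}
```

The tag is what lets a single reducer tell a sales amount from an account name when both arrive under the same key.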
> 
> The map outputs are grouped and sorted by key, and the reducer processes
> each group. The order of the values within a group is not guaranteed.
> The inputs of the reduce method are as follows:
> 
> <key: 001,
>   values: {sales 35.99, sales 78.95, sales 9.99, accounts John Allen}>
> <key: 002,
>   values: {sales 12.49, sales 21.99, sales 93.45, accounts Abigail Smith}>
> <key: 003,
>   values: {sales 499.99, accounts April Stevens}>
> <key: 004,
>   values: {sales 13.42, accounts Nasser Hafez}>
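
The grouping and the reduce call can be imitated outside Hadoop. The sketch below is plain Java and only an approximation: ShuffleSketch and groupByKey() are hypothetical names, a TreeMap stands in for the framework's sort-and-group step, and reduce() copies the logic of ReduceJoinReducer but returns the output line instead of writing to a Context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {

    // Collect the values of each key; TreeMap keeps the keys sorted,
    // roughly what the shuffle does before reduce() is called.
    public static Map<String, List<String>> groupByKey(List<String[]> mapOutputs) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] pair : mapOutputs) {
            groups.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return groups;
    }

    // Same logic as ReduceJoinReducer.reduce: count and sum the "sales"
    // values, remember the "accounts" name, emit one line per key.
    public static String reduce(String key, List<String> values) {
        String name = "";
        double total = 0.0;
        int count = 0;
        for (String value : values) {
            String[] parts = value.split("\t");
            if (parts[0].equals("sales")) {
                count++;
                total += Float.parseFloat(parts[1]); // float, as in the original job
            } else if (parts[0].equals("accounts")) {
                name = parts[1];
            }
        }
        return String.format(Locale.ROOT, "%s\t%d\t%f", name, count, total);
    }

    public static void main(String[] args) {
        List<String[]> mapOutputs = Arrays.asList(
            new String[] {"001", "sales\t35.99"},
            new String[] {"002", "sales\t12.49"},
            new String[] {"004", "sales\t13.42"},
            new String[] {"003", "sales\t499.99"},
            new String[] {"001", "sales\t78.95"},
            new String[] {"002", "sales\t21.99"},
            new String[] {"002", "sales\t93.45"},
            new String[] {"001", "sales\t9.99"},
            new String[] {"001", "accounts\tJohn Allen"},
            new String[] {"002", "accounts\tAbigail Smith"},
            new String[] {"003", "accounts\tApril Stevens"},
            new String[] {"004", "accounts\tNasser Hafez"});
        // One reduce call per key, in key order.
        for (Map.Entry<String, List<String>> group : groupByKey(mapOutputs).entrySet()) {
            System.out.println(reduce(group.getKey(), group.getValue()));
        }
    }
}
```

With the twelve map outputs listed above, this should print the same four lines as part-r-00000, with tabs between the fields. The reducer does not depend on the order of the values, which is why the result is stable even though Hadoop does not guarantee that order.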
> 
> Regards,
> Akira
> 
> (2014/02/17 1:14), EdwardKing wrote:
>> Hello everyone,
>>      I am a newbie to Hadoop 2.2.0 and I am puzzled by the reduce method. I
>> have two text files, sales.txt and account.txt, as follows:
>> sales.txt
>> 001 35.99 2012-03-15
>> 002 12.49 2004-07-02
>> 004 13.42 2005-12-20
>> 003 499.99 2010-12-20
>> 001 78.95 2012-04-02
>> 002 21.99 2006-11-30
>> 002 93.45 2008-09-10
>> 001 9.99 2012-05-17
>>
>> account.txt
>> 001 John Allen Standard 2012-03-15
>> 002 Abigail Smith Premium 2004-07-13
>> 003 April Stevens Standard 2010-12-20
>> 004 Nasser Hafez Premium 2001-04-23
>>
>> ReduceJoin.java is as follows:
>> import java.io.IOException;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.Path;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.mapreduce.Job;
>> import org.apache.hadoop.mapreduce.Mapper;
>> import org.apache.hadoop.mapreduce.Reducer;
>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>> import org.apache.hadoop.mapreduce.lib.input.MultipleInputs ;
>> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat ;
>>
>> public class ReduceJoin
>> {
>>       
>>       public static class SalesRecordMapper
>>       extends Mapper<Object, Text, Text, Text>{
>>           
>>           public void map(Object key, Text value, Context context
>>           ) throws IOException, InterruptedException
>>           {
>>               String record = value.toString() ;
>>               String[] parts = record.split("\t") ;
>>               
>>               context.write(new Text(parts[0]), new Text("sales\t"+parts[1])) ;
>>           }
>>       }
>>       
>>       public static class AccountRecordMapper
>>       extends Mapper<Object, Text, Text, Text>{
>>           
>>           public void map(Object key, Text value, Context context
>>           ) throws IOException, InterruptedException
>>           {
>>               String record = value.toString() ;
>>               String[] parts = record.split("\t") ;
>>               
>>               context.write(new Text(parts[0]), new Text("accounts\t"+parts[1])) ;
>>           }
>>       }
>>       
>>       public static class ReduceJoinReducer
>>       extends Reducer<Text, Text, Text, Text>
>>       {
>>           
>>           public void reduce(Text key, Iterable<Text> values,
>>               Context context
>>               ) throws IOException, InterruptedException
>>           {
>>               String name = "" ;
>>               double total = 0.0 ;
>>               int count = 0 ;
>>               
>>               for(Text t: values)
>>               {
>>                   String parts[] = t.toString().split("\t") ;
>>                   
>>                   if (parts[0].equals("sales"))
>>                   {
>>                       count++ ;
>>                       total+= Float.parseFloat(parts[1]) ;
>>                   }
>>                   else if (parts[0].equals("accounts"))
>>                   {
>>                       name = parts[1] ;
>>                   }
>>               }
>>               
>>               String str = String.format("%d\t%f", count, total) ;
>>               context.write(new Text(name), new Text(str)) ;
>>           }
>>       }
>>       
>>       public static void main(String[] args) throws Exception {
>>           Configuration conf = new Configuration();
>>           Job job = Job.getInstance(conf, "Reduce-side join");
>>           job.setJarByClass(ReduceJoin.class);
>>           job.setReducerClass(ReduceJoinReducer.class);
>>           job.setOutputKeyClass(Text.class);
>>           job.setOutputValueClass(Text.class);
>>           MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, SalesRecordMapper.class) ;
>>           MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, AccountRecordMapper.class) ;
>>           Path outputPath = new Path(args[2]);
>>           FileOutputFormat.setOutputPath(job, outputPath);
>>           // Remove any previous output directory so the job can be rerun.
>>           outputPath.getFileSystem(conf).delete(outputPath, true);
>>           
>>           System.exit(job.waitForCompletion(true) ? 0 : 1);
>>       }
>> }
>>
>> I created join.jar and ran it:
>> $ hadoop jar join.jar ReduceJoin sales accounts outputs
>> $ hadoop fs -cat /user/garry/outputs/part-r-00000
>> John Allen 3 124.929998
>> Abigail Smith 3 127.929996
>> April Stevens 1 499.989990
>> Nasser Hafez 1 13.420000
>>
>> I understand that the map method turns these text files into key/value
>> pairs like the following, right?
>> <001, 35.99>
>> <002, 12.49>
>> <004, 13.42>
>> <003, 499.99>
>> <001, 78.95>
>> <002, 21.99>
>> <002, 93.45>
>> <001, 9.99>
>> <001, John Allen>
>> <002, Abigail Smith>
>> <003, April Stevens>
>> <004, Nasser Hafez>
>>
>> But I don't understand the reduce method. How does it produce the
>> following result? Could anyone give the detailed steps? Thanks in
>> advance.
>> John Allen 3 124.929998
>> Abigail Smith 3 127.929996
>> April Stevens 1 499.989990
>> Nasser Hafez 1 13.420000
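
The odd trailing digits in these totals (124.929998 instead of 124.93) come from parsing each amount with Float.parseFloat before adding it to a double. A small sketch, in plain Java, compares that with Double.parseDouble; the class TotalPrecisionSketch and its two methods are hypothetical names for illustration only.

```java
import java.util.Locale;

public class TotalPrecisionSketch {

    // Mirrors the reducer: each amount is parsed as a 32-bit float,
    // then widened to double when added to the running total.
    public static String totalViaFloat(String... amounts) {
        double total = 0.0;
        for (String amount : amounts) {
            total += Float.parseFloat(amount);
        }
        return String.format(Locale.ROOT, "%f", total);
    }

    // Alternative: parse each amount as a 64-bit double from the start.
    public static String totalViaDouble(String... amounts) {
        double total = 0.0;
        for (String amount : amounts) {
            total += Double.parseDouble(amount);
        }
        return String.format(Locale.ROOT, "%f", total);
    }

    public static void main(String[] args) {
        String[] sales001 = {"35.99", "78.95", "9.99"};
        System.out.println(totalViaFloat(sales001));  // 124.929998, as in the job output
        System.out.println(totalViaDouble(sales001)); // 124.930000
    }
}
```

If exact decimal totals matter, java.math.BigDecimal would be the safer choice, but that goes beyond what the original job does.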
>>
>
