unsubscribe

2023-08-21 Thread Soumya Banerjee



Unsubscribe

2022-04-08 Thread Soumya Banerjee
Unsubscribe


how are you?

2013-07-21 Thread Soumya Banerjee
 http://bayviewelc.co.nz/awuvhx/ngneirluavatltnigirttua

 Soumya Banerjee
 7/21/2013 1:40:15 PM


Re: How to mapreduce in the scenario

2012-05-29 Thread Soumya Banerjee
Hi,

You can also try Hadoop's reduce-side join functionality.
Look in contrib/datajoin/hadoop-datajoin-*.jar for the base Map and
Reduce classes that implement it.
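
To illustrate the idea behind a reduce-side join, here is a minimal plain-Java sketch. It does not use Hadoop or the datajoin classes themselves. The class name ReduceSideJoinSketch, the Tagged helper, and the "a"/"b" source tags are made up for this example. Each record is tagged with the input it came from, records are grouped by key, and every "a" record is paired with every "b" record that shares its key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of a reduce-side join; no Hadoop classes are used.
final class ReduceSideJoinSketch {

    // A record tagged with the input it came from ("a" or "b"). Hypothetical helper.
    static final class Tagged {
        final String tag;
        final String key;
        final String rest;

        Tagged(String tag, String key, String rest) {
            this.tag = tag;
            this.key = key;
            this.rest = rest;
        }
    }

    // "Map" phase: split "id,rest" at the first comma and attach the source tag.
    static Tagged tag(String source, String line) {
        int comma = line.indexOf(',');
        return new Tagged(source, line.substring(0, comma), line.substring(comma + 1));
    }

    // "Shuffle" and "reduce" phases: group by key, then pair each "a" record
    // with each "b" record in the same group (an inner join).
    static List<String> join(List<Tagged> records) {
        Map<String, List<Tagged>> groups = new TreeMap<>();
        for (Tagged r : records) {
            groups.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> e : groups.entrySet()) {
            for (Tagged left : e.getValue()) {
                if (!left.tag.equals("a")) {
                    continue;
                }
                for (Tagged right : e.getValue()) {
                    if (right.tag.equals("b")) {
                        out.add(e.getKey() + "," + left.rest + "," + right.rest);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tagged> records = new ArrayList<>();
        records.add(tag("a", "id1,name1,age1"));
        records.add(tag("a", "id4,name4,age4")); // no match in b, so it is dropped
        records.add(tag("b", "id1,address1"));
        join(records).forEach(System.out::println); // prints "id1,name1,age1,address1"
    }
}
```

In a real job the grouping is done by the framework's shuffle, so only the tagging and the per-key combine would need to be written.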

Regards,
Soumya.

On Tue, May 29, 2012 at 4:10 PM, Devaraj k devara...@huawei.com wrote:

 Hi Gump,

   MapReduce is a good fit for this type of problem (joins).

 I hope this helps you solve the problem you described.

 1. Map output key and value classes: Use Text.class as the map output key
 class and write a value class (CombinedValue.class). The value class
 should be able to hold the values from both files (a.txt and b.txt), as
 shown below.

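 // A map output value only needs Writable (WritableComparator is a class, not an interface).
 class CombinedValue implements org.apache.hadoop.io.Writable
 {
   String name = "";
   int age;
   String address = "";
   boolean isLeft; // flag to identify from which file
   public void write(java.io.DataOutput out) throws java.io.IOException {
     out.writeUTF(name); out.writeInt(age); out.writeUTF(address); out.writeBoolean(isLeft);
   }
   public void readFields(java.io.DataInput in) throws java.io.IOException {
     name = in.readUTF(); age = in.readInt(); address = in.readUTF(); isLeft = in.readBoolean();
   }
 }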

 2. Mapper: Write a map() function that parses records from both
 files (a.txt, b.txt) and emits the common output key and value classes.

 3. Partitioner: Make sure the partitioner sends all (key, value) pairs
 with the same key to the same reducer. Hadoop's default HashPartitioner
 already does this.

 4. Reducer: In the reduce() function, you receive the records from
 both files for each key and can combine them easily.
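
 Steps 2 and 3 above can be sketched in plain Java, without any Hadoop classes. This is only an illustration under assumptions: the class name MapAndPartitionSketch, the simplified CombinedValue stand-in, the field order (id,name,age in a.txt and id,address in b.txt), and the sample values are all made up for the example. The partition function follows the hash-modulo scheme that Hadoop's default HashPartitioner uses.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Plain-Java sketch of the map and partition steps; no Hadoop classes are used.
final class MapAndPartitionSketch {

    // Simplified stand-in for the CombinedValue class from step 1.
    static final class CombinedValue {
        String name;
        int age;
        String address;
        boolean isLeft; // true when the record came from a.txt
    }

    // Step 2: parse a line from either file into a common (key, value) pair.
    // Assumed layouts: a.txt is "id,name,age" and b.txt is "id,address".
    static Map.Entry<String, CombinedValue> map(String fileName, String line) {
        String[] fields = line.split(",");
        CombinedValue value = new CombinedValue();
        value.isLeft = fileName.equals("a.txt");
        if (value.isLeft) {
            value.name = fields[1];
            value.age = Integer.parseInt(fields[2].trim());
        } else {
            value.address = fields[1];
        }
        return new SimpleEntry<>(fields[0], value);
    }

    // Step 3: hash partitioning. Masking with Integer.MAX_VALUE keeps the
    // result non-negative even when hashCode() is negative.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        Map.Entry<String, CombinedValue> left = map("a.txt", "id1,alice,30");
        Map.Entry<String, CombinedValue> right = map("b.txt", "id1,1 Main St");
        // Equal keys always land on the same reducer index.
        System.out.println(partition(left.getKey(), 4) == partition(right.getKey(), 4)); // prints "true"
    }
}
```

 Because both records for id1 reach the same reducer, the reduce() call for that key sees the a.txt value and the b.txt value together and can merge them, using isLeft to tell them apart.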


 Thanks
 Devaraj


 
 From: liuzhg [liu...@cernet.com]
 Sent: Tuesday, May 29, 2012 3:45 PM
 To: common-user@hadoop.apache.org
 Subject: How to mapreduce in the scenario

 Hi,

 I wonder whether Hadoop can effectively solve the following problem:

 ==
 input file: a.txt, b.txt
 result: c.txt

 a.txt:
 id1,name1,age1,...
 id2,name2,age2,...
 id3,name3,age3,...
 id4,name4,age4,...

 b.txt:
 id1,address1,...
 id2,address2,...
 id3,address3,...

 c.txt
 id1,name1,age1,address1,...
 id2,name2,age2,address2,...
 

 I know this can be done well with a database,
 but I want to handle it with Hadoop if possible.
 Can Hadoop meet this requirement?

 Any suggestions would help. Thank you very much!

 Best Regards,

 Gump