from:"Soumya Banerjee"

unsubscribe

2023-08-21 Thread Soumya Banerjee

Unsubscribe

2022-04-08 Thread Soumya Banerjee

Unsubscribe

how are you?

2013-07-21 Thread Soumya Banerjee

 http://bayviewelc.co.nz/awuvhx/ngneirluavatltnigirttua

 Soumya Banerjee

 7/21/2013 1:40:15 PM

Re: How to mapreduce in the scenario

2012-05-29 Thread Soumya Banerjee

Hi,

You can also try Hadoop's reduce-side join functionality.
Look in contrib/datajoin/hadoop-datajoin-*.jar for the base Map and
Reduce classes that implement it.

Regards,
Soumya.

On Tue, May 29, 2012 at 4:10 PM, Devaraj k devara...@huawei.com wrote:

 Hi Gump,

   MapReduce is well suited to this type of problem (joins).

 I hope the following helps you solve the problem you described.

 1. Map output key and value classes: Use Text.class as the map output key
 class and write a value class (CombinedValue.class) that can hold the
 values from both files (a.txt and b.txt), as shown below.

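 class CombinedValue implements Writable
 {
   String name;     // from a.txt
   int age;         // from a.txt
   String address;  // from b.txt
   boolean isLeft;  // flag to identify from which file
   // write(DataOutput) and readFields(DataInput) must also be implemented
 }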

 2. Mapper: Write a map() function that parses records from both files
 (a.txt and b.txt) and emits the common output key and value classes.

 3. Partitioner: Write the partitioner so that all (key, value) pairs
 with the same key are sent to the same reducer.

 4. Reducer: In the reduce() function, you will receive the records from
 both files for each key, and you can combine them easily.
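
 The four steps above can be sketched in plain Java, without any Hadoop
 classes, to show how the data flows. This is only an illustration under
 stated assumptions: the class name ReduceSideJoinSketch and the join(),
 map(), partition() and reduce() helpers are hypothetical, in-memory maps
 stand in for the shuffle, and inner-join behaviour is assumed from the
 c.txt example in the question (id4 has no address, so it is dropped).
 The partition formula is the one used by Hadoop's default HashPartitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a reduce-side join; no Hadoop classes are used.
class ReduceSideJoinSketch {

    // Step 1: one value type for records from both files, tagged by origin.
    static final class CombinedValue {
        final String payload;  // all fields after the id, kept as raw text
        final boolean isLeft;  // true: record from a.txt, false: from b.txt

        CombinedValue(String payload, boolean isLeft) {
            this.payload = payload;
            this.isLeft = isLeft;
        }
    }

    // Step 3: equal keys always land in the same partition (reducer).
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Step 2: parse "id,rest" and emit (id, tagged value) to its partition.
    static void map(String line, boolean isLeft,
                    List<Map<String, List<CombinedValue>>> partitions) {
        int comma = line.indexOf(',');
        if (comma < 0) {
            return; // skip lines without an id field (assumed malformed)
        }
        String id = line.substring(0, comma);
        String rest = line.substring(comma + 1);
        partitions.get(partition(id, partitions.size()))
                  .computeIfAbsent(id, k -> new ArrayList<>())
                  .add(new CombinedValue(rest, isLeft));
    }

    // Step 4: for one key, pair every a.txt record with every b.txt record.
    static List<String> reduce(String id, List<CombinedValue> values) {
        List<String> lefts = new ArrayList<>();
        List<String> rights = new ArrayList<>();
        for (CombinedValue v : values) {
            (v.isLeft ? lefts : rights).add(v.payload);
        }
        List<String> out = new ArrayList<>();
        for (String l : lefts) {
            for (String r : rights) {
                out.add(id + "," + l + "," + r);
            }
        }
        return out;
    }

    // Drives map -> partition -> reduce over two in-memory "files".
    static List<String> join(List<String> a, List<String> b, int numReducers) {
        List<Map<String, List<CombinedValue>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>()); // sorted keys, like reducer input
        }
        for (String line : a) {
            map(line, true, partitions);
        }
        for (String line : b) {
            map(line, false, partitions);
        }
        List<String> out = new ArrayList<>();
        for (Map<String, List<CombinedValue>> part : partitions) {
            for (Map.Entry<String, List<CombinedValue>> e : part.entrySet()) {
                out.addAll(reduce(e.getKey(), e.getValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList(
            "id1,name1,age1", "id2,name2,age2", "id3,name3,age3", "id4,name4,age4");
        List<String> b = Arrays.asList(
            "id1,address1", "id2,address2", "id3,address3");
        for (String row : join(a, b, 2)) {
            System.out.println(row);
        }
    }
}
```

 With the sample data from the question, join() yields one joined row each
 for id1, id2 and id3, and nothing for id4. The order of the rows depends on
 which partition each id hashes to. In a real job the CombinedValue class
 would also need Hadoop serialization, which this sketch leaves out.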


 Thanks
 Devaraj


 
 From: liuzhg [liu...@cernet.com]
 Sent: Tuesday, May 29, 2012 3:45 PM
 To: common-user@hadoop.apache.org
 Subject: How to mapreduce in the scenario

 Hi,

 I wonder whether Hadoop can effectively solve the following problem:

 ==
 input file: a.txt, b.txt
 result: c.txt

 a.txt:
 id1,name1,age1,...
 id2,name2,age2,...
 id3,name3,age3,...
 id4,name4,age4,...

 b.txt:
 id1,address1,...
 id2,address2,...
 id3,address3,...

 c.txt:
 id1,name1,age1,address1,...
 id2,name2,age2,address2,...
 

 I know that this can be done easily with a database,
 but I would like to handle it with Hadoop if possible.
 Can Hadoop meet this requirement?

 Any suggestions would help. Thank you very much!

 Best Regards,

 Gump
