Re: How to mapreduce in the scenario

2012-05-30 Thread samir das mohapatra
From: liuzhg [liu...@cernet.com] Sent: Tuesday, May 29, 2012 3:45 PM To: common-user@hadoop.apache.org Subject: How to mapreduce in the scenario Hi, I wonder that if Hadoop can solve effectively the question as following: == input file

RE: How to mapreduce in the scenario

2012-05-30 Thread Wilson Wayne - wwilso
mohapatra [mailto:samir.help...@gmail.com] Sent: Wednesday, May 30, 2012 8:33 AM To: common-user@hadoop.apache.org Subject: Re: How to mapreduce in the scenario Yes . Hadoop Is only for Huge Dataset Computaion . May not good for small dataset. On Wed, May 30, 2012 at 6:53 AM, liuzhg liu

How to mapreduce in the scenario

2012-05-29 Thread liuzhg
Hi, I wonder that if Hadoop can solve effectively the question as following: == input file: a.txt, b.txt result: c.txt a.txt: id1,name1,age1,... id2,name2,age2,... id3,name3,age3,... id4,name4,age4,... b.txt: id1,address1,... id2,address2,...

Re: How to mapreduce in the scenario

2012-05-29 Thread Michel Segel
Hive? Sure Assuming you mean that the id is a FK common amongst the tables... Sent from a remote device. Please excuse any typos... Mike Segel On May 29, 2012, at 5:29 AM, liuzhg liu...@cernet.com wrote: Hi, I wonder that if Hadoop can solve effectively the question as following:

Re: How to mapreduce in the scenario

2012-05-29 Thread Nitin Pawar
hive is one approach (similar to routine databases but exactly not the same) if you are looking at mapreduce program then using multipleinput formats http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html On Tue, May 29, 2012 at 4:02 PM,

RE: How to mapreduce in the scenario

2012-05-29 Thread Devaraj k
combine those easily. Thanks Devaraj From: liuzhg [liu...@cernet.com] Sent: Tuesday, May 29, 2012 3:45 PM To: common-user@hadoop.apache.org Subject: How to mapreduce in the scenario Hi, I wonder that if Hadoop can solve effectively the question

Re: How to mapreduce in the scenario

2012-05-29 Thread Soumya Banerjee
. Thanks Devaraj From: liuzhg [liu...@cernet.com] Sent: Tuesday, May 29, 2012 3:45 PM To: common-user@hadoop.apache.org Subject: How to mapreduce in the scenario Hi, I wonder that if Hadoop can solve effectively the question as following

Re: How to mapreduce in the scenario

2012-05-29 Thread samir das mohapatra
Yes it is possible by using MultipleInputs format to multiple mapper (basically 2 different mapper) Setp: 1 MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class, *Mapper1.class*); MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class, *Mapper2.class*);

How to mapreduce in the scenario

2012-05-29 Thread lzg
Hi, I wonder that if Hadoop can solve effectively the question as following: == input file: a.txt, b.txt result: c.txt a.txt: id1,name1,age1,... id2,name2,age2,... id3,name3,age3,... id4,name4,age4,... b.txt: id1,address1,... id2,address2,...

Re: How to mapreduce in the scenario

2012-05-29 Thread Robert Evans
Yes you can do it. In pig you would write something like A = load ‘a.txt’ as (id, name, age, ...) B = load ‘b.txt’ as (id, address, ...) C = JOIN A BY id, B BY id; STORE C into ‘c.txt’ Hive can do it similarly too. Or you could write your own directly in map/redcue or using the data_join jar.

Re: How to mapreduce in the scenario

2012-05-29 Thread liuzhg
the files and you can combine those easily. Thanks Devaraj From: liuzhg [liu...@cernet.com] Sent: Tuesday, May 29, 2012 3:45 PM To: common-user@hadoop.apache.org Subject: How to mapreduce in the scenario Hi, I wonder that if Hadoop can solve

Re: How to mapreduce in the scenario

2012-05-29 Thread Nitin Pawar
the records from both the files and you can combine those easily. Thanks Devaraj From: liuzhg [liu...@cernet.com] Sent: Tuesday, May 29, 2012 3:45 PM To: common-user@hadoop.apache.org Subject: How to mapreduce in the scenario Hi