From: liuzhg [liu...@cernet.com]
Sent: Tuesday, May 29, 2012 3:45 PM
To: common-user@hadoop.apache.org
Subject: How to mapreduce in the scenario
Hi,
I wonder whether Hadoop can effectively solve the following problem:
==
input file
mohapatra [mailto:samir.help...@gmail.com]
Sent: Wednesday, May 30, 2012 8:33 AM
To: common-user@hadoop.apache.org
Subject: Re: How to mapreduce in the scenario
Yes. Hadoop is meant for computation over huge datasets.
It may not be a good fit for a small dataset.
On Wed, May 30, 2012 at 6:53 AM, liuzhg liu
Hi,
I wonder whether Hadoop can effectively solve the following problem:
==
input file: a.txt, b.txt
result: c.txt
a.txt:
id1,name1,age1,...
id2,name2,age2,...
id3,name3,age3,...
id4,name4,age4,...
b.txt:
id1,address1,...
id2,address2,...
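For inputs small enough to fit in memory, the join described above may not need Hadoop at all. The following is a minimal sketch in plain Java, not anything posted in the thread: the class name SmallJoinSketch is hypothetical, the ids in b.txt are assumed to be unique, and the layout of c.txt (the a.txt line followed by the b.txt fields) is an assumption, since the question does not show it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: in-memory inner join of two small CSV inputs on the first field.
final class SmallJoinSketch {

    // Joins lines of a.txt with lines of b.txt that share the same id.
    // Assumption: each id appears at most once in bLines.
    public static List<String> join(List<String> aLines, List<String> bLines) {
        // Index b.txt by id: id -> everything after the first comma.
        Map<String, String> restById = new HashMap<>();
        for (String line : bLines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                restById.put(parts[0], parts[1]);
            }
        }
        // Stream through a.txt and keep only the ids that also occur in b.txt.
        List<String> joined = new ArrayList<>();
        for (String line : aLines) {
            String id = line.split(",", 2)[0];
            String rest = restById.get(id);
            if (rest != null) {
                joined.add(line + "," + rest);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("id1,name1,age1", "id2,name2,age2", "id3,name3,age3");
        List<String> b = Arrays.asList("id1,address1", "id2,address2");
        for (String row : join(a, b)) {
            System.out.println(row);
        }
    }
}
```

Ids that appear in only one of the two files are dropped here, which is inner-join behaviour; whether that is what the question wants is not stated.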
Hive?
Sure, assuming you mean that the id is a foreign key common to both tables...
Sent from a remote device. Please excuse any typos...
Mike Segel
On May 29, 2012, at 5:29 AM, liuzhg liu...@cernet.com wrote:
Hi,
I wonder whether Hadoop can effectively solve the following problem:
Hive is one approach (similar to conventional databases, but not exactly the same).
If you are looking at a MapReduce program, then use MultipleInputs:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
On Tue, May 29, 2012 at 4:02 PM,
combine those easily.
Thanks
Devaraj
Yes, it is possible by using MultipleInputs to send each file to its own mapper
(basically 2 different mappers).
Step 1:
MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class,
    Mapper1.class);
MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class,
    Mapper2.class);
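The message stops at step 1, so what Mapper1, Mapper2 and the reducer would do is not shown. One common pattern is a reduce-side join: each mapper emits the id as the key and tags the value with its source file, and the reducer pairs the tagged values for each id. The sketch below simulates that flow in plain Java without the Hadoop API. The class name, the "A|" and "B|" tags, and the output layout are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone simulation of a reduce-side join (no Hadoop classes used).
final class ReduceSideJoinSketch {

    // Stand-in for a mapper: emits (id, tag + rest of line) into the shuffle map.
    static void map(String tag, List<String> lines, Map<String, List<String>> shuffle) {
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            shuffle.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(tag + parts[1]);
        }
    }

    // Stand-in for the reducer: for one id, pair every A record with every B record.
    static List<String> reduce(String id, List<String> taggedValues) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (String value : taggedValues) {
            if (value.startsWith("A|")) {
                fromA.add(value.substring(2));
            } else {
                fromB.add(value.substring(2));
            }
        }
        List<String> out = new ArrayList<>();
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(id + "," + a + "," + b);
            }
        }
        return out;
    }

    // Runs both "mappers", groups by id (the shuffle), then runs the "reducer" per id.
    public static List<String> join(List<String> aLines, List<String> bLines) {
        Map<String, List<String>> shuffle = new TreeMap<>(); // sorted by key, like the shuffle
        map("A|", aLines, shuffle);
        map("B|", bLines, shuffle);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : shuffle.entrySet()) {
            result.addAll(reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("id1,name1,age1", "id2,name2,age2", "id3,name3,age3");
        List<String> b = Arrays.asList("id1,address1", "id2,address2");
        for (String row : join(a, b)) {
            System.out.println(row);
        }
    }
}
```

In a real job the two map calls would be the Mapper1 and Mapper2 classes registered above, and the framework would do the grouping by key. An id that occurs in only one file yields no output here, because one of the two lists in reduce is empty.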
Yes, you can do it. In Pig you would write something like
A = LOAD 'a.txt' USING PigStorage(',') AS (id, name, age, ...);
B = LOAD 'b.txt' USING PigStorage(',') AS (id, address, ...);
C = JOIN A BY id, B BY id;
STORE C INTO 'c.txt';
Hive can do it similarly too. Or you could write your own directly in
MapReduce or using the data_join jar.
the records from
both the files and you can combine those easily.
Thanks
Devaraj