Are there any production applications that use Hama?
On Thu, Sep 3, 2009 at 7:07 PM, Edward J. Yoon edwardy...@apache.org wrote:
Just FYI, Hama (Hadoop Matrix, http://incubator.apache.org/hama) is also
considering adopting this computing model based on bulk synchronous
parallel.
On Fri, Sep 4,
Hello!
Running a simple MR job, and setting a replication factor of 2. Now,
after its execution, the output is split into files named part-0 and so
on. What I want to ask is: can we avoid having these keys, or key-value
pairs, printed in the output files? I mean, I am getting the output in the
Hi Sugandha,
If you only want the value, you need to set the key to NullWritable in
the reduce.
e.g.
output.collect(NullWritable.get(), value);
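For completeness, a minimal sketch of such a reducer using the old 0.20
mapred API (the class name and the Text types are illustrative, not from
the original mail):

  import java.io.IOException;
  import java.util.Iterator;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;

  // Illustrative reducer that drops the key: TextOutputFormat omits
  // NullWritable keys (and the tab), so only values land in part-* files.
  public class ValueOnlyReducer extends MapReduceBase
      implements Reducer<Text, Text, NullWritable, Text> {
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<NullWritable, Text> output,
                       Reporter reporter) throws IOException {
      while (values.hasNext()) {
        output.collect(NullWritable.get(), values.next());
      }
    }
  }

You would also want conf.setOutputKeyClass(NullWritable.class) on the
JobConf so the output types match.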
On Fri, Sep 4, 2009 at 12:46 AM, Sugandha Naolekar sugandha@gmail.com wrote:
Hello!
Running a simple MR job, and setting a replication
Or you can output the data in the keys and NullWritable as the value.
That way you'll get only unique data, since reduce groups by key...
On 9/4/09, zhang jianfeng zjf...@gmail.com wrote:
Hi Sugandha,
If you only want the value, you need to set the key to NullWritable in
the reduce.
e.g.
Before setting the task limits, do take into account the memory
considerations (many archive posts on this can be found).
Also, your tasktracker and datanode daemons will run on that machine as
well, so you might want to set aside some processing power for them.
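For reference, the per-node limits in question live in mapred-site.xml;
a sketch with illustrative values (tune to your cores and memory):

  <!-- Illustrative values only; leave headroom for the TaskTracker
       and DataNode daemons mentioned above. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>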
Cheers!
Amogh
-Original Message-
On Sep 3, 2009, at 11:53 PM, Ramiya V wrote:
Hi,
Thanks Amandeep and Ashish!
@Ashish: I have set the hive.metastore.warehouse.dir parameter as
/home/hive/warehouse. This warehouse directory is on the local
filesystem. So will the tables now get stored on the local
filesystem or HDFS? I
Have a look at JobClient; it should suffice.
Cheers!
Amogh
-Original Message-
From: bharath vissapragada [mailto:bharathvissapragada1...@gmail.com]
Sent: Friday, September 04, 2009 9:15 PM
To: common-user@hadoop.apache.org
Subject: Re: Some issues!
Hey,
I have one more doubt,
Hi Ramya,
Yes, you have to explicitly give the HDFS path, so
hdfs://namenode:port/home/hive/warehouse
should work in case you want to keep the same path in HDFS.
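In hive-site.xml that would look something like this (the namenode host
and port are placeholders, as above):

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://namenode:port/home/hive/warehouse</value>
  </property>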
Ashish
-Original Message-
From: Brian Bockelman [mailto:bbock...@cse.unl.edu]
Sent: Friday, September 04, 2009 5:43 AM
To:
Dear All,
I am using Hadoop 0.20.0. I have an application that needs to run map-reduce
functions iteratively. Right now, the way I am doing this is to create a new
Job for each pass of the map-reduce. That seems to cost a lot. Is there any
way to run map-reduce functions iteratively in one Job?
Thanks a lot
Wait... why are you using the same mapper and reducer and calling it 10
times? Is the output of the first iteration being fed as input into the
second one? What are these jobs doing? Tell us a bit more about that. There
might be a way by which you can club some jobs together into one job and
reduce the
You can create different mapper and reducer classes and create separate job
configs for them. You can pass these different configs to the Tool object in
the same parent class... but they will essentially be different jobs being
called together from inside the same Java parent class (a sketch follows
below).
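A hedged sketch of such a parent class (old 0.20 mapred API; the paths,
iteration count, and IdentityMapper/IdentityReducer stand-ins are
illustrative, swap in your own classes):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.lib.IdentityMapper;
  import org.apache.hadoop.mapred.lib.IdentityReducer;

  public class IterativeDriver {
    public static void main(String[] args) throws Exception {
      String input = "/data/iter0";              // illustrative path
      for (int i = 1; i <= 10; i++) {            // illustrative count
        JobConf conf = new JobConf(IterativeDriver.class);
        conf.setJobName("iteration-" + i);
        conf.setMapperClass(IdentityMapper.class);    // your mapper here
        conf.setReducerClass(IdentityReducer.class);  // your reducer here
        conf.setOutputKeyClass(LongWritable.class);   // matches TextInputFormat
        conf.setOutputValueClass(Text.class);
        String output = "/data/iter" + i;
        FileInputFormat.setInputPaths(conf, new Path(input));
        FileOutputFormat.setOutputPath(conf, new Path(output));
        JobClient.runJob(conf);  // blocks until this pass finishes
        input = output;          // this pass's output feeds the next pass
      }
    }
  }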
Why do you
Dear Amandeep,
Thanks for the fast reply. I will try the method you mentioned.
In my understanding, when a job is submitted, there will be a separate Java
process in the jobtracker responsible for that job, and there will be an
initialization and cleanup cost for each job. If every iteration is a
Yes, the output of the first iteration is the input of the second iteration.
Actually, I am trying to solve the PageRank problem. In the algorithm, you
have to run several iterations, each using the output of the previous
iteration as input and producing the output for the next.
It is not a real life
OK. Thank you very much! Helps me a lot, I will try it.
Boyu
On Fri, Sep 4, 2009 at 3:25 PM, Amandeep Khurana ama...@gmail.com wrote:
Ah ok.. Then I think you'll have to fire separate jobs. But they can all be
fired from inside one parent job - the method I explained earlier. Try that
out...
Hi all,
What is the best way to copy directories from HDFS to local disk in
0.19.1?
Thanks,
Kris.
Amogh, thanks for your reply.
I will make my question more clear.
Suppose I have an array and it got updated in MRjob1, and I want to
access it in MRjob2. This is what I intended in my previous question. I
have gone through the JobConf class, but I haven't found anything useful.
If
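One common workaround, offered as a hedged sketch (not necessarily what
Amogh had in mind; the property name and value format are made up for
illustration): if the array is small, serialize it into the second job's
configuration in the driver, and parse it back in the task's configure():

  // Driver side: stash the (small) array into MRjob2's config.
  JobConf job2 = new JobConf(MyDriver.class);    // hypothetical driver
  job2.set("myapp.shared.array", "1,7,42");      // e.g. values from MRjob1

  // Task side: recover it when the mapper/reducer (extending
  // MapReduceBase) is initialized.
  public void configure(JobConf conf) {
    String[] parts = conf.get("myapp.shared.array").split(",");
    int[] shared = new int[parts.length];
    for (int i = 0; i < parts.length; i++) {
      shared[i] = Integer.parseInt(parts[i]);
    }
  }

For anything larger, write it to an HDFS file in MRjob1 and read it back
(e.g. via the DistributedCache) in MRjob2.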
You mean programmatically or command line?
Command line:
bin/hadoop -get /path/to/dfs/dir /path/to/local/dir
Arvind
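And for the programmatic route, a hedged sketch using the FileSystem API
(the class name and paths are illustrative):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class CopyOut {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      // Recursively copies an HDFS directory to the local filesystem,
      // equivalent to bin/hadoop fs -copyToLocal.
      fs.copyToLocalFile(new Path("/path/to/dfs/dir"),
                         new Path("/path/to/local/dir"));
    }
  }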
From: Kris Jirapinyo kjirapi...@biz360.com
To: common-user common-user@hadoop.apache.org
Sent: Friday, September 4, 2009 5:15:00 PM
Hi Arvind,
You missed the fs. The command should be:
bin/hadoop fs -get /path/to/dfs/dir /path/to/local/dir
or
bin/hadoop fs -copyToLocal /path/to/dfs/dir /path/to/local/dir
Here is the link to the shell command guide for your reference:
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html
On