Mark,
There is a setup cost when using Hadoop: for each task a new JVM must
be spawned. On such a small scale, you won't see any benefit from using MR.
J-D
On Mon, Apr 20, 2009 at 12:26 AM, Mark Kerzner markkerz...@gmail.com wrote:
Hi,
I ran a Hadoop MapReduce task in the local mode, reading and
Jean-Daniel,
I realize that. My question was whether about 2 minutes is the normal
setup/cleanup time. If it is, then fine. I would expect that on tasks taking
10-15 minutes, 2 minutes would be totally justified, and I think that this
is the guideline: each task should take minutes.
Thank
Mark,
Oh sorry, yes, you should expect that kind of delay. A tip to reduce that
overhead on big jobs with lots of tasks is to use
JobConf.setNumTasksToExecutePerJvm(int numTasks), which sets how many
tasks a JVM runs before it exits (instead of spawning a new one per task).
Happy Hadooping!
J-D
On Mon, Apr 20, 2009
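To put rough numbers on the exchange above: a fixed overhead of about 2 minutes dominates a job that does 30 seconds of real work, is minor on a task that runs for 10 minutes or more, and JVM reuse reduces how many JVM launches a large job pays for. The following is a minimal Java sketch of that arithmetic only. It does not call Hadoop; the class name JvmReuseEstimate, its methods, and the 1000-task, 2-second-startup figures are hypothetical and chosen purely for illustration. It sums startup time across tasks and ignores parallelism, so it is not a wall-clock prediction.

```java
// Hypothetical back-of-the-envelope model; it does not use any Hadoop API.
final class JvmReuseEstimate {

    /** Number of JVMs launched when each JVM runs up to tasksPerJvm tasks. */
    static long jvmLaunches(long tasks, long tasksPerJvm) {
        if (tasks < 0 || tasksPerJvm < 1) {
            throw new IllegalArgumentException("tasks must be >= 0 and tasksPerJvm >= 1");
        }
        return (tasks + tasksPerJvm - 1) / tasksPerJvm; // ceiling division
    }

    /** Total seconds spent starting JVMs, summed over all launches. */
    static long startupSeconds(long tasks, long tasksPerJvm, long startupSecondsPerJvm) {
        return jvmLaunches(tasks, tasksPerJvm) * startupSecondsPerJvm;
    }

    /** Share of total time that is fixed overhead rather than useful work. */
    static double overheadFraction(double overheadSeconds, double workSeconds) {
        return overheadSeconds / (overheadSeconds + workSeconds);
    }

    public static void main(String[] args) {
        // Figures from the thread: about 2 minutes of overhead around 30 seconds of work.
        System.out.printf("overhead share, small job: %.2f%n", overheadFraction(120, 30));
        // The same overhead on a 10-minute task.
        System.out.printf("overhead share, 10-minute task: %.2f%n", overheadFraction(120, 600));

        // Assumed values for illustration: 1000 tasks, 2 seconds of startup per JVM.
        long tasks = 1000;
        long startup = 2;
        for (long perJvm : new long[] {1, 10, 100}) {
            System.out.printf("tasksPerJvm=%d -> %d launches, %d s of startup%n",
                    perJvm, jvmLaunches(tasks, perJvm), startupSeconds(tasks, perJvm, startup));
        }
    }
}
```

In this simplified model the startup total falls in proportion to the reuse factor, which is why the tip matters mainly for jobs with many short tasks and does little for a single small local-mode run.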
On Apr 20, 2009, at 9:56 AM, Mark Kerzner wrote:
Hi,
I ran a Hadoop MapReduce task in the local mode, reading and writing from
HDFS, and it took 2.5 minutes. Essentially the same operations on the local
file system without MapReduce took 1/2 minute. Is this to be expected?
Hmm...
Arun, thank you very much for the answer. I will turn off the combiner. I am
debugging intermediate MR steps now, so I am mostly interested in
performance for this, and real tuning will be later, in a cluster. I am
running 18.3, but general pointers should be good enough at this stage.
I am
Hi,
I ran a Hadoop MapReduce task in the local mode, reading and writing from
HDFS, and it took 2.5 minutes. Essentially the same operations on the local
file system without MapReduce took 1/2 minute. Is this to be expected?
It seemed that the system lost most of the time in the MapReduce
@hadoop.apache.org
Subject: RE: Hadoop - is it good for me and performance question
Not sure if this will answer your question, but here is a similar thread
regarding Hadoop performance:
http://www.mail-archive.com/core-user@hadoop.apache.org/msg02878.html
Hadoop is good for log processing if you have a lot
@hadoop.apache.org
Subject: RE: Hadoop - is it good for me and performance question
Thanks for your reply Haijun,
Do you know what makes Hadoop run so slowly? I have been trying to figure
it out myself, but I can't imagine anything so complicated that it justifies
Hadoop's performance and latency.
-Original
http://www.mail-archive.com/core-user@hadoop.apache.org/msg02906.html
-Original Message-
From: yair gotdanker [mailto:[EMAIL PROTECTED]
Sent: Sunday, June 29, 2008 4:46 AM
To: core-user@hadoop.apache.org
Subject: Hadoop - is it good for me and performance question
Hello all,
I am
Hello all,
I am a newbie to Hadoop. The technology seems very interesting, but I am not
sure it suits my needs. I would really appreciate your feedback.
The problem:
I have multiple log servers, each receiving 10-100 MB/minute. The received
data is processed to produce aggregated data.
The data