On Feb 12, 2008, at 7:08 AM, Marco Nicosia wrote:
DFS should place one replica per rack:
http://issues.apache.org/jira/browse/HADOOP-2559
No, that would hurt the aggregate write throughput. Read the comment
on 2559:
http://issues.apache.org/jira/browse/HADOOP-2559?focusedCommentId=1256712
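For context, the throughput trade-off Owen is pointing at comes from the default placement heuristic: first replica on the writer's node, the remaining replicas together on a single other rack, so each block crosses the rack switch once instead of twice. A minimal Python sketch of that heuristic, assuming my reading of the thread is right -- the function and node names here are invented for illustration, not the actual BlockPlacementPolicyDefault code:

```python
import random

def place_replicas(writer_node, racks, replication=3):
    """Toy model of HDFS default placement: replica 1 on the writer's
    node, replicas 2..n together on ONE remote rack. Placing them on a
    single remote rack means one cross-rack transfer per block; one
    replica per rack (the original HADOOP-2559 proposal) would need two,
    halving aggregate cross-rack write bandwidth."""
    placements = [writer_node]
    local_rack = next(r for r, nodes in racks.items() if writer_node in nodes)
    remote_rack = random.choice([r for r in racks if r != local_rack])
    spares = [n for n in racks[remote_rack] if n != writer_node]
    placements += random.sample(spares, min(replication - 1, len(spares)))
    return placements

# Hypothetical two-rack topology for illustration
racks = {"rackA": ["a1", "a2", "a3"], "rackB": ["b1", "b2", "b3"]}
print(place_replicas("a1", racks))
```

With two racks, the two non-local replicas always land together on the other rack, which is the behavior the JIRA comment defends.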
On 2/9/08 22:53, "Owen O'Malley" <[EMAIL PROTECTED]> wrote:
>
> On Feb 8, 2008, at 9:32 AM, Jeff Eastman wrote:
>
>> I noticed that phenomenon right off the bat. Is that a designed
>> "feature" or just an unhappy consequence of how blocks are allocated?
>
> It was driven by a desire to maximize HDFS write throughput, which
> has unfortunate effects in the case of a
On 2/8/08 9:32 AM, "Jeff Eastman" <[EMAIL PROTECTED]> wrote:
> I noticed that phenomenon right off the bat. Is that a designed "feature"
> or just an unhappy consequence of how blocks are allocated?
My understanding is that this is by design--when you are running an MR
job, you want the output,
--
From: Allen Wittenauer [mailto:[EMAIL PROTECTED]
Sent: Friday, February 08, 2008 9:15 AM
To: core-user@hadoop.apache.org
Subject: Re: Starting up a larger cluster
On 2/7/08 11:01 PM, "Tim Wintle" <[EMAIL PROTECTED]> wrote:
> it's
> useful to be able to connect from nodes that aren't in the slaves file
> so that you can put in input data direct from another machine that's not
> part of the cluster,
I'd actually recommend this as a best practice. We've
You can set which nodes are allowed to connect in hadoop-site.xml - it's
useful to be able to connect from nodes that aren't in the slaves file
so that you can put in input data direct from another machine that's not
part of the cluster, or add extra machines on the fly (just make sure
they're rout
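On the "set which nodes are allowed to connect" point: in Hadoop of this era that is normally done with the dfs.hosts include file (and dfs.hosts.exclude for decommissioning) referenced from hadoop-site.xml. A sketch, assuming a 0.16-era configuration -- the file path here is invented for illustration:

```
<!-- hadoop-site.xml excerpt (hypothetical path): the NameNode only
     accepts DataNodes whose hostnames appear in this file; leave the
     property unset to accept any host. -->
<property>
  <name>dfs.hosts</name>
  <value>/path/to/conf/allowed-hosts</value>
</property>
```

Note this gates which nodes may *join the cluster* as DataNodes; clients pushing input data from outside the cluster, as Tim describes, only need network access to the NameNode and DataNodes.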
Oops, should be TaskTracker.
-Original Message-
From: Jeff Eastman [mailto:[EMAIL PROTECTED]
Sent: Thursday, February 07, 2008 12:24 PM
To: core-user@hadoop.apache.org
Subject: RE: Starting up a larger cluster
Hi Ben,
I've been down this same path recently and I think I understand your
issues:
1) Yes, you need the hadoop folder to be in the same location on each
node. Only the master node actually uses the slaves file, to start up
DataNode and JobTracker daemons on those nodes.
2) If you did not specif
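Point (1) above can be sketched concretely -- a 0.16-era layout, with hostnames invented for illustration:

```
# conf/slaves on the master -- one worker hostname per line.
# Only the master reads this file; it ssh'es to each host listed
# here (hadoop must be at the same path on every node) to start
# the worker daemons, e.g. via bin/start-all.sh.
node01.example.com
node02.example.com
node03.example.com
```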