On 9/11/08 2:39 AM, "Alex Loddengaard" <[EMAIL PROTECTED]> wrote:
> I've never dealt with a large cluster, though I'd imagine it is managed the
> same way as small clusters:

    Maybe. :)

> -Use hostnames or ips, whichever is more convenient for you

    Use hostnames.  Seriously.  Who are you people, using raw IPs for
things? :)  Besides, you're going to need hostnames for the eventual support
of Kerberos.
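
    Since I'm telling you to trust DNS, here's a quick, hand-wavy sanity
check (it assumes a conf/slaves file with one hostname per line; the paths
are just examples) that forward and reverse DNS agree for every node, which
Hadoop, and eventually Kerberos, will care about:

        # check that forward and reverse DNS agree for each slave
        while read host; do
          ip=$(getent hosts "$host" | awk '{print $1}')
          rev=$(getent hosts "$ip" | awk '{print $2}')
          [ "$host" = "$rev" ] || echo "MISMATCH: $host -> $ip -> $rev"
        done < conf/slaves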

> -All the slaves need to go into the slave file

    We only put this file on the namenode and secondary namenode, so that
cluster-wide commands can't accidentally be launched from anywhere else.
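
    For reference, the slaves file is nothing fancy, just one worker
hostname per line (these names are made up):

        node001.grid.example.com
        node002.grid.example.com
        node003.grid.example.com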

> -You can update software by using bin/hadoop-daemons.sh.  Something like:
> #bin/hadoop-daemons.sh "rsync (mastersrcpath) (localdestpath)"

    We don't use that because it doesn't take down nodes into consideration
(and you *will* have down nodes!) or deal with machines that sit outside the
grid (such as our gateways/bastion hosts, data loading machines, etc.).
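
    To see why that bites, here's a rough sketch of what even a hand-rolled
push has to do, skipping unreachable hosts and remembering them for a retry
(the paths and timeouts are illustrative, not what we actually run):

        # push the conf dir from the master, skipping hosts that are down
        while read host; do
          if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            rsync -a /opt/hadoop/conf/ "$host:/opt/hadoop/conf/"
          else
            echo "$host" >> /tmp/down-nodes.txt   # retry these later
          fi
        done < conf/slaves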

    Instead, use a real system configuration management package such as
bcfg2, smartfrog, puppet, cfengine, etc.  [Steve, you owe me for the plug.
:) ]
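
    Whichever tool you pick, the point is convergence: a node that was down
during a push picks up the change when it comes back.  A crude way to audit
for drift by hand (the file path and ssh timeout are illustrative):

        # compare each node's config checksum against the master copy
        want=$(md5sum /opt/hadoop/conf/hadoop-site.xml | awk '{print $1}')
        while read host; do
          got=$(ssh -o ConnectTimeout=5 "$host" \
                "md5sum /opt/hadoop/conf/hadoop-site.xml" 2>/dev/null |
                awk '{print $1}')
          [ "$got" = "$want" ] || echo "drift (or down): $host"
        done < conf/slaves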

> I created a wiki page that currently contains one tip for managing large
> clusters.  Could others add to this wiki page?
> 
> <http://wiki.apache.org/hadoop/LargeClusterTips>

    Quite a bit of what we do is covered in the latter half of
http://tinyurl.com/5foamm .  This is a presentation I did at ApacheCon EU
this past April that covered some of the behind-the-scenes details of the
large clusters at Y!.  At some point I'll probably do an updated version
that includes more adminy things (such as why we push four different types
of Hadoop configurations per grid) while others talk about core Hadoop stuff.
