On Sun, Nov 20, 2005 at 08:51:13PM -0500, Stéphane Lacasse wrote:
[snip discussion about installing]

I've built a cluster system (128 nodes + 1 master) in a similar fashion
to what you are after:
1. PXE-boot install environment for performing installs of both the
master and all of the nodes.
2. The install environment uses the Gentoo Installer, with the CLI
frontend I wrote for the GLI project, and performs complete installs of
nodes in under 20 minutes (depending on network traffic).

By using GLI, it's a simple matter of altering the install profiles to
reconfigure the cluster and wipe the nodes when changing their purpose
(presently we have an MPI mode and a MOSIX mode). Some of the cluster
users need assurances that none of their data remains on the cluster
after they are done, hence the need to be able to reinstall easily.

For regular system operation, we specifically left boot loaders off
all machines, as we've hit cases where the MBR is in a state that just
hangs the machine instead of falling through to PXE. By always booting
via PXE, and controlling what each machine boots from there, we've had
much more reliable results.
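
To illustrate the idea of steering each node from the PXE side, here is a
rough sketch rather than our actual setup. It assumes PXELINUX is the PXE
loader, which looks for a per-node config file named 01-<mac> before
falling back to "default". The directory, kernel/initrd paths, root device
and the "install"/"run" mode names are all hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes PXELINUX and that TFTP_CFG_DIR is its pxelinux.cfg
# directory. All paths, labels and the root device below are hypothetical.
TFTP_CFG_DIR="${TFTP_CFG_DIR:-/tftpboot/pxelinux.cfg}"

# PXELINUX per-node file name: "01-" + MAC, lowercase, dashes for colons.
pxe_cfg_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# set_node_mode MAC MODE
#   install: boot the (hypothetical) installer kernel and initrd
#   run:     boot the production kernel over the network, root on local disk
set_node_mode() {
    mac="$1"; mode="$2"
    case "$mode" in
        install) kernel="install/vmlinuz"; append="initrd=install/initrd.img" ;;
        run)     kernel="node/vmlinuz";    append="root=/dev/sda3" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    mkdir -p "$TFTP_CFG_DIR" || return 1
    # Write the per-node config; the node picks it up on its next PXE boot.
    cat > "$TFTP_CFG_DIR/$(pxe_cfg_name "$mac")" <<EOF
DEFAULT $mode
LABEL $mode
  KERNEL $kernel
  APPEND $append
EOF
}
```

With something like this, "set_node_mode 00:1A:2B:3C:4D:5E install" followed
by a power cycle would send that node into the installer, and switching it
back to "run" returns it to normal operation without touching its disk.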

The above design also allows re-configuring the cluster into multiple
smaller clusters with physical network separation (using VLAN-capable
switches).

Also, make use of your cluster tools to administer the cluster. OpenPBS
allows running a job on all nodes, so use it to run emerge -K [package]
(--usepkgonly). Don't use -k (--usepkg): binpkgs don't currently have any
locking in $PKGDIR, and can get corrupted if two emerge processes try to
create a binpkg at the same time.
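
As a rough sketch of what such a job could look like (not the exact script
we use): the helper below prints a PBS job script that calls pbsdsh to run
emerge -K on every allocated node. The function name, job name and node
count are hypothetical, and it assumes the job runs as a user permitted to
call emerge on the nodes.

```shell
#!/bin/sh
# Sketch only: prints a PBS job script to stdout. make_emerge_job, the job
# name and the node count are illustrative, not part of OpenPBS or Portage.
# make_emerge_job NODES PACKAGE
make_emerge_job() {
    nodes="$1"; pkg="$2"
    cat <<EOF
#!/bin/sh
#PBS -N emerge-binpkg
#PBS -l nodes=$nodes
# -K (--usepkgonly) installs only from prebuilt binary packages in \$PKGDIR,
# so no node tries to build and write a binpkg there.
pbsdsh /usr/bin/emerge -K $pkg
EOF
}
```

On the master, "make_emerge_job 128 app-editors/vim | qsub" would then
submit it, since qsub reads the job script from stdin when no file is given.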

-- 
Robin Hugh Johnson
E-Mail     : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85
