On Sun, Nov 20, 2005 at 08:51:13PM -0500, Stéphane Lacasse wrote:
[snip discussion about installing]
I've built a cluster system (128 nodes + 1 master) in a similar fashion to what you are after.

1. A PXE-boot install environment performs the installs of both the master and all of the nodes.
2. The install environment uses the Gentoo Installer, with the CLI frontend I wrote for the GLI project, and performs a complete install of a node in under 20 minutes (depending on network traffic).

With GLI, it's a simple matter of altering the install profiles to reconfigure the cluster, and to wipe the nodes when changing their purpose (presently we have an MPI mode and a MOSIX mode). Some of the cluster users need assurances that none of their data remains on the cluster after they are done, hence the need to be able to reinstall easily.

For regular system operation, we specifically left boot loaders off all machines, as we've hit cases where the MBR is in a state that just hangs the machine instead of going to PXE. By always enforcing PXE, and controlling how each machine boots via PXE instead, we've had much better results.

The above design also allows reconfiguring the cluster into multiple smaller clusters with physical network separation (using VLAN-capable switches).

Also, make use of your cluster tools to administer the cluster. OpenPBS allows running a job on all nodes, so use it to run emerge -K [package]. (Use -K, not -k: binpkgs don't currently have any locking in $PKGDIR, and can get corrupted if two emerge processes try to create a binpkg at the same time.)

-- 
Robin Hugh Johnson
E-Mail   : [EMAIL PROTECTED]
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
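
As a rough illustration of the "use OpenPBS to run emerge -K on all nodes" suggestion, here is a minimal sketch that only generates the text of a PBS job script. The function name build_pbs_job, the job name, and the use of pbsdsh with the path /usr/bin/emerge are assumptions made for this example, not details from the setup described above; check them against your own PBS installation before submitting anything.

```python
import shlex


def build_pbs_job(package: str, nodes: int, job_name: str = "emerge-binpkg") -> str:
    """Return the text of a PBS job script that installs one binary package.

    Hypothetical helper: it only builds the script text and does not submit it.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not package or package.startswith("-"):
        raise ValueError("package must be a package atom, not an option")

    # -K (--usepkgonly) installs from existing binary packages only, so no
    # node tries to write a new binpkg into a shared $PKGDIR.
    emerge_cmd = f"/usr/bin/emerge -K {shlex.quote(package)}"

    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}",
        # Assumption: pbsdsh is available to run the command on the
        # nodes allocated to this job.
        f"pbsdsh {emerge_cmd}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_pbs_job("app-editors/vim", 128), end="")
```

The generated text could be saved to a file and handed to qsub. Building the script in one place makes it harder to slip in -k by accident.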
