On Fri, 25 May 2007, Randy Philipp wrote:

> I am currently running a cluster with several nodes that have lost their
> Myrinet cards.  I wanted to make these nodes available to run serial
> jobs without restricting the jobs to these nodes, preferentially
> scheduling the jobs to run on the Myrinetless nodes first.

What we do is list the nodes that have lost their Myrinet cards (must be 
catching) in a standing reservation that only permits jobs requesting 1 CPU 
to run on them.

It's not perfect: in theory someone could build their serial code with a 
Myrinet-enabled MPICH and get an error when the MPI stack tries to do the 
Myrinet DMA allocations and fails because it can't reach the card, but 
we've not seen that happen.

Here is what is in our current Moab configuration, but it should work in Maui 
(caveat emptor, batteries not included, etc).

SRCFG[nomyrinet] STARTTIME=0:0:0 ENDTIME=23:59:59
SRCFG[nomyrinet] DEPTH=30
SRCFG[nomyrinet] HOSTLIST=node009,node035,node062,node084,node085
SRCFG[nomyrinet] PROCLIMIT=1
SRCFG[nomyrinet] PERIOD=INFINITY
SRCFG[nomyrinet] ACCESS=DEDICATED FLAGS=ignstate
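
If the set of cardless nodes changes over time, the stanza above could be generated rather than edited by hand. What follows is only a minimal sketch in Python: the function name render_srcfg and its arguments are hypothetical, and it simply reproduces the SRCFG lines shown above for a given host list. It does not talk to Maui or Moab in any way.

```python
def render_srcfg(name, hosts, proclimit=1, depth=30):
    """Return SRCFG lines for a standing reservation covering `hosts`.

    Hypothetical helper: it only formats text in the same shape as the
    configuration quoted in this message. The parameter values mirror
    that example and are not validated against the scheduler.
    """
    if not hosts:
        raise ValueError("need at least one host")
    prefix = f"SRCFG[{name}]"
    settings = [
        "STARTTIME=0:0:0 ENDTIME=23:59:59",
        f"DEPTH={depth}",
        "HOSTLIST=" + ",".join(hosts),   # comma-separated, no spaces
        f"PROCLIMIT={proclimit}",
        "PERIOD=INFINITY",
        "ACCESS=DEDICATED FLAGS=ignstate",
    ]
    return "\n".join(f"{prefix} {s}" for s in settings)


if __name__ == "__main__":
    # Host names taken from the example configuration above.
    print(render_srcfg("nomyrinet",
                       ["node009", "node035", "node062", "node084", "node085"]))
```

The output would be pasted into the scheduler configuration file by hand; the scheduler still has to re-read its configuration before the change takes effect.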

Best of luck!
Chris
-- 
 Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

