On Fri, 25 May 2007, Randy Philipp wrote:

> I am currently running a cluster with several nodes that have lost their
> Myrinet cards. I wanted to make these nodes available to run serial
> jobs without restricting the jobs to these nodes, preferentially
> scheduling the jobs to run on the Myrinet-less nodes first.

What we do is list the nodes that have lost their Myrinet cards (must be
catching) in a standing reservation that only permits jobs requesting
1 CPU to run on them.

It's not perfect: in theory someone could build their serial code with a
Myrinet-enabled MPICH and get an error when the MPI stack tries to do the
Myrinet DMA allocations and fails because it can't get to the card, but
we've not seen that happen.

Here is what is in our current Moab configuration, but it should work in
Maui (caveat emptor, batteries not included, etc).

SRCFG[nomyrinet] STARTTIME=0:0:0 ENDTIME=23:59:59
SRCFG[nomyrinet] DEPTH=30
SRCFG[nomyrinet] HOSTLIST=node009,node035,node062,node084,node085
SRCFG[nomyrinet] PROCLIMIT=1
SRCFG[nomyrinet] PERIOD=INFINITY
SRCFG[nomyrinet] ACCESS=DEDICATED FLAGS=ignstate

Best of luck!
Chris
-- 
Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
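
Since the host list in a reservation like this tends to grow as more cards fail, it may be handy to generate the stanza rather than edit it by hand. What follows is only a rough sketch, not part of the original configuration: the function name render_srcfg and its arguments are made up for illustration, and it simply reproduces the same SRCFG parameters and values shown in the message for whatever host list it is given.

```python
def render_srcfg(name, hosts, proclimit=1, depth=30):
    """Return SRCFG lines for a standing reservation over `hosts`.

    Hypothetical helper: the parameter names and fixed values are copied
    from the configuration in the message, nothing more.
    """
    if not hosts:
        raise ValueError("need at least one host")
    prefix = f"SRCFG[{name}]"
    settings = [
        "STARTTIME=0:0:0 ENDTIME=23:59:59",
        f"DEPTH={depth}",
        "HOSTLIST=" + ",".join(hosts),   # comma-separated, no spaces
        f"PROCLIMIT={proclimit}",        # only 1-CPU jobs by default
        "PERIOD=INFINITY",
        "ACCESS=DEDICATED FLAGS=ignstate",
    ]
    return "\n".join(f"{prefix} {s}" for s in settings)


if __name__ == "__main__":
    # Host names taken from the configuration in the message.
    print(render_srcfg(
        "nomyrinet",
        ["node009", "node035", "node062", "node084", "node085"],
    ))
```

The output would still need to be pasted into the scheduler configuration and the scheduler reconfigured in whatever way your site normally does that.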
_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers