----- "Itay M" <[EMAIL PROTECTED]> wrote:

> Hi,
> Is there any guide explaining how to implement checkpointing in an
> OpenPBS / MAUI environment with Linux as the compute nodes?

I believe you can find posts about getting suspend/resume
working in the list archives (though we've never used it
here, so I can't vouch for whether it still works).

Some codes (like NAMD) implement checkpointing themselves.
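
As a rough illustration of what application-level checkpointing
means (this is not how NAMD does it, just a minimal sketch): the
program periodically writes its own state to disk and, on startup,
resumes from that file if it exists. The file name, the state
layout and the function names below are all hypothetical.

```python
import json
import os
import tempfile

CHECKPOINT = "state.json"  # hypothetical checkpoint file name


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"step": 0, "total": 0}


def save_state(state, path=CHECKPOINT):
    """Write to a temporary file and then rename it, so a crash in the
    middle of a write never leaves a truncated checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)  # atomic replacement on the same filesystem


def run(n_steps, every=10, path=CHECKPOINT):
    """Toy computation (a running sum of step numbers) that picks up
    from the last checkpoint instead of starting over."""
    state = load_state(path)
    while state["step"] < n_steps:
        state["step"] += 1
        state["total"] += state["step"]
        if state["step"] % every == 0:
            save_state(state, path)
    save_state(state, path)  # final checkpoint when the loop ends
    return state
```

With something like this in the application, a job that gets killed
can simply be resubmitted with the same job script and will continue
from the last saved step, without any help from the batch system.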

Torque's trunk currently has preliminary code for
supporting the BLCR checkpointing kernel module, though
it's likely to work only for single-CPU jobs at the
moment (you'll probably need one of the MPIs that
supports BLCR to get further).

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers
