----- "Itay M" <[EMAIL PROTECTED]> wrote:
> Hi,
> Is there any guide explaining how implement checkpointing in an
> OpenPBS / MAUI environment with linux as the compute nodes?

I believe you might find posts in the archives about getting suspend/resume
working (though we've never used it here, so I can't vouch for whether it
still works).

Some codes (like NAMD) implement checkpointing themselves.

Torque's trunk currently contains preliminary code for supporting the BLCR
checkpointing kernel module, though it's likely to work only for single-CPU
jobs at the moment (you'll probably need one of the MPIs that supports BLCR
to get further).

cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers