Hi All,

I have some very long BH simulations to run and I'd like to use checkpointing for them. I haven't really done checkpointing before. What I do know is that checkpointing information can be specified in the parameter file (for use by Cactus), and that Simfactory seems to have some support for handling checkpointing ("restart-id", etc.). Of course, the scheduling systems at HPC centres (e.g. PBSPro) have their own checkpointing support, but I don't want to use that. It is probably best to use the scheduler only to set the walltime.
So, my main question is: assuming I set a maximum walltime of 12 hours and set my simulation to dump checkpoints every 3 hours (in walltime units), how do I *restart* my job at the end of the 12 hours using Simfactory, so that the simulation starts off from the last checkpoint it dropped before terminating? What extra command-line options should I pass to Simfactory's submit command?

Below is the segment of my parfile where the checkpointing information is given. I provide it as a sample, and as a basis for anyone who would like to advise me on how such information should be specified.

##### Checkpointing #########
CarpetIOHDF5::checkpoint = yes
IO::checkpoint_ID = yes
IO::recover = autoprobe
IO::checkpoint_every = 1024
IO::out_proc_every = 2
IO::checkpoint_keep = 3
IO::checkpoint_dir = $parfile
Carpet::regrid_during_recovery = no
CarpetIOHDF5::use_grid_structure_from_checkpoint = yes
CarpetIOHDF5::open_one_input_file_at_a_time = yes

Your advice and assistance will be highly appreciated.

Best,
Dumsani
