Indeed, that’s what I originally wanted to do (after reading your Checkpointing Howto): provide a base directory like /home/checkpoint where all users can create subdirectories (e.g. /home/checkpoint/$JOB_ID/) for storing the checkpoint files of their jobs. But what if two users specify a checkpoint file with the same name and try to store it directly in the base directory (e.g. /home/checkpoint/ckpt-file)? The second user could not (and also should not) write to the file because he lacks the permissions, so his job won’t be able to store its checkpoints. This means that a user must choose either a unique name for a subdirectory (like $JOB_ID) or a unique name for the checkpoint file (if it is stored directly in the base directory). Since the user carries that responsibility anyway, I thought: why not just leave it up to the user where to store the checkpoint file within his home directory?
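For reference, the per-job subdirectory variant could be sketched roughly as below. This is only a minimal sketch: the function name ckpt_job_dir, the fallback to $HOME/checkpoint and the chmod 700 are my own assumptions, not something SGE provides. It relies on SGE_CKPT_DIR and JOB_ID being set in the job environment.

```shell
#!/bin/sh
# Sketch: pick a per-job checkpoint directory below a shared base directory.
# Admin side, done once (sticky bit, so it behaves like /tmp):
#   mkdir /home/checkpoint && chmod 1777 /home/checkpoint

# ckpt_job_dir is a hypothetical helper, not part of Grid Engine.
ckpt_job_dir() {
    # Assumption: fall back to $HOME/checkpoint if SGE_CKPT_DIR is unset.
    base=${SGE_CKPT_DIR:-$HOME/checkpoint}

    # JOB_ID makes the subdirectory name unique per job.
    if [ -z "$JOB_ID" ]; then
        echo "JOB_ID is not set" >&2
        return 1
    fi
    dir=$base/$JOB_ID

    # -p: no error if the directory already exists (e.g. after a restart).
    mkdir -p "$dir" || return 1

    # Assumption: keep other users out, since the base directory is shared.
    chmod 700 "$dir" || return 1

    printf '%s\n' "$dir"
}
```

In a job script this would be used as `CKPT_DIR=$(ckpt_job_dir) || exit 1`, with the checkpoint files then written below "$CKPT_DIR". Two users can then pick the same checkpoint file name without colliding.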
Best,
Nico

On 08.01.16 17:45, "Reuti" <[email protected]> wrote:

>
>> On 08.01.2016 at 16:51, <[email protected]> <[email protected]> wrote:
>>
>> Reuti, thank you for your support. I think I will leave it entirely up to
>> the user to provide a location for the checkpointing files within his home
>> directory, and use the transparent interface only to provide a signal to
>> initiate the checkpoint generation. Certainly, one drawback with this
>> approach is that SGE_CKPT_DIR will point to a location outside the user's
>> home directory where the user has no write permission. This renders the
>> provided env variable useless, and may only confuse the user.
>
> In my clusters I created a directory /home/checkpoint where SGE_CKPT_DIR
> points to as a central place where the users have write access and set
> the sticky bit for it, so that it behaves like /tmp.
>
> -- Reuti
>
>
>> Best regards,
>> Nico
>>
>> On 08.01.16 16:07, "Reuti" <[email protected]> wrote:
>>
>>>
>>>> On 08.01.2016 at 15:38, Reuti <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On 08.01.2016 at 14:51, [email protected] wrote:
>>>>>
>>>>> Dear all
>>>>>
>>>>> We are using OGS/GE 2011.11. I'm evaluating the built in checkpointing
>>>>> support. I would like to store checkpoints in the job owner's home
>>>>> directory. Setting ckpt_dir (in the configuration of the checkpointing
>>>>> environment, transparent interface) to a path containing a variable
>>>>> (e.g. $HOME/checkpointing) seems not possible, right? Each user has
>>>>> therefore to provide the location of the checkpoint files from within
>>>>> the job, and the env var $SGE_CKPT_DIR is useless in this case. Is that
>>>>> true, or do I miss something?
>>>>
>>>> AFAICS ckpt_dir is just a central place to define a string there. Hence
>>>> it could still be set to $HOME/checkpoint and works for all users as
>>>> long as the variable is expanded in their scripts.
>>>>
>>>> All checkpoint processes are executed under the particular user account
>>>> and so the access to it should be possible.
>>>>
>>>> In case you prefer a canonical name, it might indeed be necessary to
>>>> evaluate inside the jobscript and all the used checkpointing scripts
>>>> something like:
>>>>
>>>> CKPT_DIR=$(readlink -f $SGE_CKPT_DIR)*
>>>
>>> Well, I tested the stuff initially with ${!SGE_CKPT_DIR}. But it only
>>> works for a plain HOME setting in ckpt_dir of course - what I missed.
>>>
>>> Sorry for the confusion. Unless you dare to use `eval` to the statement
>>> above (or use two statements to look to $HOME first) I think there is no
>>> generic way.
>>>
>>> -- Reuti
>>>
>>>>
>>>> Sure, the plain definition could be placed in each script, but this way
>>>> it's necessary to change it only in the checkpointing definition in case
>>>> you want to move it to a different location.
>>>>
>>>> -- Reuti
>>>>
>>>> *) This could be placed in a starter_method too.

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
