Indeed, that’s what I originally wanted to do (after reading your
Checkpointing Howto): provide a base directory like /home/checkpoint
where all users can create subdirectories (e.g. /home/checkpoint/$JOB_ID/)
for storing the checkpoint files of their jobs. But what if two users
specify a checkpoint file with the same name and try to store it directly
in the base directory (e.g. /home/checkpoint/ckpt-file)? The second user
could not (and also should not) write to the file because he lacks the
permissions, so his job won’t be able to store its checkpoints.
This means that a user must choose either a unique name for a subdirectory
(like $JOB_ID) or a unique name for the checkpoint file (if stored
directly in the base directory). Since the user carries that responsibility
either way, I thought: why not just leave it up to the user where to
store the checkpoint file within his home directory?
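
For the shared-base variant, the per-job subdirectory could be chosen
roughly as follows. This is only a sketch: the function name job_ckpt_dir,
the fallback to /home/checkpoint and the fallback to the shell's PID are
my own assumptions for illustration, not something Grid Engine provides.

```shell
#!/bin/sh
# Sketch only: per-job checkpoint directory below a shared, sticky base.
# One-time admin setup (as root), so that the base behaves like /tmp:
#   mkdir /home/checkpoint && chmod 1777 /home/checkpoint

# job_ckpt_dir [BASE]   (hypothetical helper)
# Prints the checkpoint directory for this job and creates it if needed.
# BASE defaults to $SGE_CKPT_DIR, then to /home/checkpoint (assumption).
job_ckpt_dir() {
    base=${1:-${SGE_CKPT_DIR:-/home/checkpoint}}
    # JOB_ID is set inside a Grid Engine job; fall back to the PID otherwise.
    dir=$base/${JOB_ID:-$$}
    if [ ! -d "$dir" ]; then
        # umask 077 in a subshell: the new directory is private to its owner.
        ( umask 077 && mkdir "$dir" ) || return 1
    fi
    # Refuse a directory that exists but belongs to someone else.
    [ -w "$dir" ] || return 1
    printf '%s\n' "$dir"
}
```

In a job script this would be used as CKPT_DIR=$(job_ckpt_dir) || exit 1,
so the job stops early if the directory cannot be created or written.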

Best,
Nico





On 08.01.16 17:45, "Reuti" <[email protected]> wrote:

>
>> On 08.01.2016 at 16:51, <[email protected]>
>><[email protected]> wrote:
>> 
>> Reuti, thank you for your support. I think I will leave it entirely up to
>> the user to provide a location for the checkpointing files within his home
>> directory, and use the transparent interface only to provide a signal to
>> initiate the checkpoint generation. Certainly, one drawback with this
>> approach is that SGE_CKPT_DIR will point to a location outside the user's
>> home directory where the user has no write permission. This renders the
>> provided env variable useless, and may only confuse the user.
>
>In my clusters I created a directory /home/checkpoint, which SGE_CKPT_DIR
>points to, as a central place where the users have write access, and set
>the sticky bit on it so that it behaves like /tmp.
>
>-- Reuti
>
>
>> Best regards,
>> Nico
>> 
>> 
>> 
>> 
>> 
>> On 08.01.16 16:07, "Reuti" <[email protected]> wrote:
>> 
>>> 
>>>> On 08.01.2016 at 15:38, Reuti <[email protected]> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>>> On 08.01.2016 at 14:51, [email protected] wrote:
>>>>> 
>>>>> Dear all,
>>>>> 
>>>>> We are using OGS/GE 2011.11. I'm evaluating the built-in checkpointing
>>>>> support. I would like to store checkpoints in the job owner's home
>>>>> directory. Setting ckpt_dir (in the configuration of the checkpointing
>>>>> environment, transparent interface) to a path containing a variable
>>>>> (e.g. $HOME/checkpointing) does not seem to be possible, right? Each user
>>>>> therefore has to provide the location of the checkpoint files from within
>>>>> the job, and the env var $SGE_CKPT_DIR is useless in this case. Is that
>>>>> true, or am I missing something?
>>>> 
>>>> AFAICS ckpt_dir is just a central place to define a string. Hence
>>>> it could still be set to $HOME/checkpoint and work for all users as
>>>> long as the variable is expanded in their scripts.
>>>> 
>>>> All checkpoint processes are executed under the particular user account,
>>>> so access to it should be possible.
>>>> 
>>>> In case you prefer a canonical name, it might indeed be necessary to
>>>> evaluate something like the following inside the job script and all
>>>> the checkpointing scripts used:
>>>> 
>>>> CKPT_DIR=$(readlink -f "$SGE_CKPT_DIR")   *)
>>> 
>>> Well, I initially tested this with ${!SGE_CKPT_DIR}. But of course that
>>> only works for a plain HOME setting in ckpt_dir, which is what I missed.
>>> 
>>> Sorry for the confusion. Unless you dare to apply `eval` to the statement
>>> above (or use two statements to look up $HOME first), I think there is no
>>> generic way.
>>> 
>>> -- Reuti
>>> 
>>>> 
>>>> Sure, the plain definition could be placed in each script, but this way
>>>> it's necessary to change it only in the checkpointing definition in case
>>>> you want to move it to a different location.
>>>> 
>>>> -- Reuti
>>>> 
>>>> *) This could be placed in a starter_method too.
>>> 
>> 
>> 
>
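
For the case discussed in the quoted thread, where ckpt_dir holds the
literal string $HOME/checkpoint and SGE_CKPT_DIR therefore arrives
unexpanded in the job, a minimal sketch that avoids `eval` could look like
this. The function name expand_home is hypothetical, and it deliberately
handles only a leading $HOME, not arbitrary variables.

```shell
#!/bin/sh
# Sketch only: expand a leading literal "$HOME" in a path without eval.
# expand_home is a hypothetical helper, not part of Grid Engine.
expand_home() {
    case $1 in
        '$HOME')   printf '%s\n' "$HOME" ;;              # exactly the literal $HOME
        '$HOME'/*) printf '%s\n' "$HOME/${1#*/}" ;;      # literal $HOME plus a subpath
        *)         printf '%s\n' "$1" ;;                 # anything else is passed through
    esac
}
```

A job script or starter_method could then set
CKPT_DIR=$(expand_home "$SGE_CKPT_DIR") and use CKPT_DIR from there on.
Restricting the expansion to $HOME keeps the string from being executed
as shell code, which is the risk with `eval`.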


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
