Hey Joe,


Have you considered using a reservation?

An operator can reserve a node (or a set of nodes) for a given time, and
as a user, you would simply submit your jobs within this reservation.
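
As a rough sketch of what that could look like (the reservation name, user name and node name below are made up, and your site may need extra options such as an account or partition), the operator creates the reservation with `scontrol` and the user then passes `--reservation` to `srun`, `salloc` or `sbatch`. The script only prints the commands so you can review them before running anything:

```shell
#!/bin/sh
# Hypothetical values, adjust to your site.
RESV=joe_checkout      # reservation name (made up)
USER_NAME=joe          # user allowed to use the reservation (made up)
NODE=some_node         # node to reserve (placeholder from the original post)

# Operator side: reserve the node for 3 hours starting now.
# A plain number for Duration is interpreted as minutes.
create_cmd="scontrol create reservation ReservationName=$RESV StartTime=now Duration=180 Users=$USER_NAME Nodes=$NODE"

# User side: start an interactive session inside the reservation.
submit_cmd="srun --reservation=$RESV --nodelist=$NODE --pty bash"

# Print instead of executing, so nothing is changed by accident.
echo "$create_cmd"
echo "$submit_cmd"
```

The reservation is not tied to any single job, so it stays in place for its whole duration even if an individual `srun` or `salloc` session ends.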

Depending on your system configuration, a node might be marked as down
if you reboot it, and an operator would then have to bring it back into
SLURM.
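
For illustration only, and assuming the node really did end up in the down state after the reboot, the operator would typically check the reason and then resume the node. The node name is a placeholder, and the commands are printed rather than executed:

```shell
#!/bin/sh
NODE=some_node   # placeholder node name

# List the reason why the node is down or drained.
check_cmd="sinfo -R --nodes=$NODE"

# Return the node to service once it is healthy again.
resume_cmd="scontrol update NodeName=$NODE State=RESUME"

echo "$check_cmd"
echo "$resume_cmd"
```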



FWIW, the KNL slurm plugin had some features that might be interesting
for you: an end user submits a job with a required cluster and/or memory
mode, the node configuration is automatically updated and the node is
rebooted if needed, and then the job starts. No operator intervention is
required in the process.
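
From memory, and only as a sketch: on a KNL system the modes were requested as node features through the job's constraint. The feature names `quad` and `cache` and the script name `job.sh` are assumptions here, since the available names depend on how the node features plugin is configured at a given site. The command is printed rather than submitted:

```shell
#!/bin/sh
JOB_SCRIPT=job.sh   # hypothetical batch script

# Ask for nodes in "quad" cluster mode AND "cache" memory mode.
# The quotes keep the shell from treating "&" as a background operator.
knl_cmd="sbatch --constraint='quad&cache' $JOB_SCRIPT"

echo "$knl_cmd"
```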





Cheers,



Gilles 

----- Original Message -----

Hello!

How best for a user to check out a slurm node?

Unfortunately, command 'salloc' doesn't appear to meet this need.

Command `salloc --nodelist some_node --time 3:00:00` gives the user a new
shell, and the user can then use `srun` to start an interactive session.

However, if the user needs to reboot the node, change BIOS settings, etc.,
`salloc` automatically terminates the allocation as soon as the new shell
is closed.

salloc: Relinquishing job allocation 82
salloc: Job allocation 82 has been revoked.

Ideally, if a user requests a node for a few hours then they can do all 
of their work in the allotted time (srun sessions, reboots, BIOS 
settings, etc) using a single job allocation.

Also, how can I reply to posts and replies on 
https://groups.google.com/g/slurm-users/?
The 'Reply all' and 'Reply to author' buttons on the site are greyed out.

Much appreciated!


 
