[slurm-users] Per-job TMPDIR: how to lookup gres allocation in prolog?

Mark Dixon Tue, 16 Nov 2021 08:51:00 -0800

Hi everyone,

I'd like to configure slurm such that users can request an amount of disk space for TMPDIR... and for that request to be reserved and quota'd via commands like "sbatch --gres tmp:10G jobscript.sh". Probably reinventing someone's wheel, but I'm almost there.


I have:

- created a local xfs filesystem, dedicated to per-job TMPDIR directories,
  with project quotas enabled on each slurmd host.

- created (slurmd) Prolog/Epilog scripts which create/delete a per-job
  directory on the xfs filesystem, owned by the job user.

- created SrunProlog/TaskProlog scripts, which set TMPDIR in the user's
  job environment to point at the per-job directory.

- added a gres defined as "Name=tmp Flags=CountOnly"

- modified the node definitions to include the amount of storage on each
  host, by adding "Gres=tmp:270G".
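
As a rough illustration of the TaskProlog step above: Slurm applies lines of the form "export NAME=value" printed on a TaskProlog's standard output to the task's environment. The sketch below assumes a hypothetical layout of one directory per job ID under a hypothetical mount point /local/jobtmp; neither name comes from the actual setup.

```python
#!/usr/bin/env python3
"""TaskProlog sketch: point TMPDIR at a per-job directory."""
import os

# Hypothetical mount point of the dedicated xfs filesystem.
TMP_BASE = "/local/jobtmp"


def task_prolog_line(base, job_id):
    """Build the 'export' line that a TaskProlog prints to set TMPDIR."""
    return f"export TMPDIR={os.path.join(base, str(job_id))}"


if __name__ == "__main__":
    # SLURM_JOB_ID is set by Slurm in the job's environment.
    job_id = os.environ.get("SLURM_JOB_ID")
    if job_id:
        print(task_prolog_line(TMP_BASE, job_id))
```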

I still need to:

- extend the Prolog script to look up the "tmp" gres allocation for the
  job.

- extend the Prolog script to set the appropriate project quota on the
  per-job TMPDIR, limiting the amount of space the directory tree can use.
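
The quota step might look roughly like the sketch below, which only builds the two xfs_quota command lines (attach the directory tree to a project, then set a hard block limit). Using the job ID as the xfs project ID is an assumption made here for illustration, as are the mount point and directory names; the paths are assumed to contain no whitespace.

```python
#!/usr/bin/env python3
"""Sketch: xfs_quota commands for a per-job project quota."""
import shlex


def quota_commands(mount, job_dir, project_id, limit_bytes):
    """Return the xfs_quota invocations as argv lists.

    project_id: numeric xfs project ID (assumption: reuse the job ID).
    limit_bytes: hard limit in bytes, passed to xfs_quota in KiB.
    """
    limit_kib = limit_bytes // 1024
    # Attach the directory tree to the project.
    setup = ["xfs_quota", "-x", "-c",
             f"project -s -p {job_dir} {project_id}", mount]
    # Set the hard block limit for that project.
    limit = ["xfs_quota", "-x", "-c",
             f"limit -p bhard={limit_kib}k {project_id}", mount]
    return [setup, limit]


if __name__ == "__main__":
    # Hypothetical values: job 123 with a 10 GiB allocation.
    for argv in quota_commands("/local/jobtmp", "/local/jobtmp/123",
                               123, 10 * 1024**3):
        print(shlex.join(argv))
```

A real Prolog would pass each list to subprocess.run(argv, check=True) as root rather than printing it.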

Unfortunately, I've not found anything in the Prolog environment (or stored on disk under /var/spool/slurmd) containing the gres allocations for the job.

I figure I can do a "scontrol show job <jobid> -d" from inside the prolog to get the job's gres information, but I'll need to hard-code the location of the scontrol binary... and the Prolog documentation explicitly tells you not to execute slurm commands from within the prolog.
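
For reference, the scontrol fallback could be sketched as below. The exact text that "scontrol show job <jobid> -d" prints for a gres differs between Slurm versions, so the pattern (a "tmp:" or "tmp=" token followed by a count with an optional K/M/G/T/P suffix) and the sample line in the comment are assumptions, not documented output. Slurm treats those suffixes as multiples of 1024.

```python
#!/usr/bin/env python3
"""Sketch: extract a job's 'tmp' gres count from scontrol output."""
import re
import subprocess

# Slurm's numeric suffixes are powers of 1024.
_MULTIPLIERS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3,
                "T": 1024**4, "P": 1024**5}
# Assumed shape of the token, e.g. "GRES=tmp:10G" (hypothetical sample).
_TMP_GRES = re.compile(r"\btmp[:=](\d+)([KMGTP]?)\b", re.IGNORECASE)


def tmp_gres_bytes(scontrol_output):
    """Return the first 'tmp' gres count in bytes, or None if absent."""
    match = _TMP_GRES.search(scontrol_output)
    if match is None:
        return None
    count, suffix = match.groups()
    return int(count) * _MULTIPLIERS[suffix.upper()]


def query_job(job_id, scontrol="scontrol"):
    """Run scontrol for one job; 'scontrol' may need a full path."""
    result = subprocess.run([scontrol, "show", "job", str(job_id), "-d"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Inside a Prolog this would be called as tmp_gres_bytes(query_job(os.environ["SLURM_JOB_ID"])). Note that the first match wins, so a job name that happens to contain "tmp:5" could be picked up by mistake.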

Is there a better way to get the job's gres information from within the prolog, please?


Thanks!

Mark
