We're in the midst of transitioning our SGE cluster to slurm 20.02.6, running
on up-to-date CentOS-7. We built RPMs from the standard tarball against CUDA
10.1. These RPMs worked just fine on our first GPU test node (with Tesla K80s)
using "AutoDetect=nvml" in /etc/gres.conf. However, we
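For reference, the AutoDetect line mentioned above lives in gres.conf; a minimal sketch per gres.conf(5) (the commented explicit form and its device paths are illustrative, not from the original message):

```
# /etc/gres.conf: let slurmd query NVML for GPU count and topology
AutoDetect=nvml
# Explicit alternative, declaring devices by hand (paths illustrative):
# Name=gpu Type=k80 File=/dev/nvidia[0-3]
```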
My mistake - from slurm.conf(5):
SrunProlog runs on the node where the "srun" is executing.
i.e. the login node, which explains why the directory is not being created on
the compute node, while the echoes work.
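For the directory to be created on the compute node itself, TaskProlog is one place to do the work, since slurmd runs it on the node executing the task, and lines it prints in the form "export NAME=value" are injected into the task's environment. A minimal sketch, wrapped in a function so it can be demoed with a throwaway base directory standing in for /local/scratch (the function name and demo job ID are illustrative):

```shell
#!/bin/bash
# Sketch of a TaskProlog-style helper: runs on the compute node, so the
# mkdir happens where the job actually executes. Slurm injects the
# "export NAME=value" lines printed here into the task's environment.
make_job_tmp() {
    local base="$1" jobid="$2"
    local tmp="${base}/${jobid}"
    mkdir -p "${tmp}"
    echo "export TMP=${tmp}"
    echo "export TMPDIR=${tmp}"
    echo "export LOCAL_TMPDIR=${tmp}"
}

# Demo with a throwaway base dir standing in for /local/scratch:
demo_base=$(mktemp -d)
make_job_tmp "${demo_base}" 80472
```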
--
David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel
dw...@drexel.edu
Hi all:
Prentice wrote:
> I don't see how that bug is related. That bug is about requiring the
> libnvidia-ml.so library for an RPM that was built with NVML
> Autodetect enabled. His problem is the opposite - he's already using
> NVML autodetect, but wants to disable that feature on a single
Well, reading the source it looks like xcgroup_set_params is just writing to
the devices.allow and devices.deny files. I haven't yet found what cg->path is
being set to, but presumably it is something like
/sys/fs/cgroup/slurm/uid_##/job_#/step_0 or equivalent for the job
in question.
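To experiment with the same setup by hand, those writes can be scripted. A dry-run sketch, assuming the cgroup v1 devices controller is mounted at the usual path (the function name, paths, and the NVIDIA char-major 195 convention are assumptions for illustration, not taken from Slurm's source); pipe the output to a root shell to actually apply it:

```shell
#!/bin/bash
# Print (don't execute) the writes a device cgroup setup amounts to,
# mirroring what the thread says xcgroup_set_params does. Assumes cgroup
# v1 with the devices controller; /dev/nvidia<N> is char major 195.
device_cgroup_cmds() {
    local cg="$1" minor="$2"
    echo "mkdir -p ${cg}"
    echo "echo 'a *:* rwm' > ${cg}/devices.deny"           # start by denying all devices
    echo "echo 'c 195:${minor} rw' > ${cg}/devices.allow"  # re-allow /dev/nvidia${minor}
    echo "echo \$\$ > ${cg}/tasks"                         # move this shell into the cgroup
}

device_cgroup_cmds /sys/fs/cgroup/devices/slurm/uid_1000/job_80472/step_0 0
```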
I'm
Hello
I am trying to debug an issue with EGL support (updated NVIDIA drivers and
now eglGetDisplay and eglQueryDevicesEXT are failing if they can't access all
/dev/nvidia# devices in slurm) and am wondering how slurm uses device cgroups
so I can implement the same cgroup setup by hand and
They're also listed on the sacct online man page:
https://slurm.schedmd.com/sacct.html
Scroll down until you see the text box with the white text on a black
background - you can't miss it.
Also, depending on how you're parsing the output, you might want to skip
printing out the headers, which
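A sketch of that kind of parsing: sacct's -n/--noheader drops the header row and -P/--parsable2 emits pipe-delimited fields, so the output splits cleanly. The sample rows below are made up to stand in for real sacct output:

```shell
#!/bin/bash
# A real invocation would be something like:
#   sacct --noheader --parsable2 --format=JobID,JobName,State,Elapsed
# Made-up sample of what that emits, for a parsing demo:
sample='80472|myjob|COMPLETED|00:05:12
80473|otherjob|FAILED|00:00:03'

# With no header row, field extraction is just a split on "|":
echo "${sample}" | awk -F'|' '{ print $1, $3 }'
```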
Hi Brian:
This works just as I expect for sbatch.
The example srun execution I showed was a non-array job, so the first half of
the "if []" statement holds. It is the second half, the branch that deals
with job arrays, that has the period.
The value of TMP is correct, i.e. "/local/scratch/80472"
And
Hi, Brian:
So, this is my SrunProlog script -- I want a job-specific tmp dir, which makes
for easy cleanup at end of job:
#!/bin/bash
if [[ -z ${SLURM_ARRAY_JOB_ID+x} ]]
then
export TMP="/local/scratch/${SLURM_JOB_ID}"
export TMPDIR="${TMP}"
export LOCAL_TMPDIR="${TMP}"
export
It seems to me, if you are using srun directly to get an interactive
shell, you can just run the script once you get your shell.
You can set the variables and then run srun. It automatically exports
the environment.
If you want to change a particular one (or more), use something like
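A sketch along those lines (variable names illustrative): srun exports the caller's environment by default, so a plain export before srun reaches the job, an inline assignment overrides it for one invocation, and srun's own --export flag can do the same. The inline override is plain shell semantics, demonstrable without srun:

```shell
#!/bin/bash
# With srun you would write e.g.:
#   export FOO=default; srun mycmd        # job sees FOO=default
#   FOO=special srun mycmd                # one-off override for this run
#   srun --export=ALL,FOO=special mycmd   # same idea, via srun's own flag
# The inline-override half of the pattern, without srun:
export FOO=default
FOO=override bash -c 'echo "child sees: $FOO"'   # prints "child sees: override"
echo "caller still sees: $FOO"                   # prints "caller still sees: default"
```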
Dear Slurm users,
We are running a cluster that has a flat account structure. All accounts
have a monthly limit that can only change on the 1st of a month. Users
assigned to the very same account shall not compete against each other
(created with fairshare=parent) and their fairshare shall be