[slurm-users] NVML autodetect "Failed to get supported memory frequencies" error

2021-03-04 Thread Joshua Baker-LePain
We're in the midst of transitioning our SGE cluster to slurm 20.02.6, running on up-to-date CentOS-7. We built RPMs from the standard tarball against CUDA 10.1. These RPMs worked just fine on our first GPU test node (with Tesla K80s) using "AutoDetect=nvml" in /etc/gres.conf. However, we
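
For reference, the basic NVML autodetection setup that message refers to looks something like this (node name and GPU count are placeholders, not taken from the thread):

    # /etc/gres.conf -- let slurmd discover GPUs via the NVIDIA management library
    AutoDetect=nvml

    # slurm.conf (only the GRES-related lines)
    GresTypes=gpu
    NodeName=gpunode01 Gres=gpu:2 State=UNKNOWN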

Re: [slurm-users] prolog not passing env var to job

2021-03-04 Thread Chin,David
My mistake - from slurm.conf(5): SrunProlog runs on the node where "srun" is executing, i.e. the login node, which explains why the directory is not being created on the compute node, while the echoes work. -- David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel dw...@drexel.edu
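
A quick sketch of the distinction (parameter names are from slurm.conf(5); the script paths are placeholders):

    # slurm.conf
    # Runs where srun itself is invoked (e.g. the login node):
    SrunProlog=/etc/slurm/srun_prolog.sh
    # Runs on the allocated compute node(s):
    Prolog=/etc/slurm/prolog.sh
    # Runs per task on the compute node; lines it prints of the form
    # "export NAME=value" are injected into the task's environment:
    TaskProlog=/etc/slurm/task_prolog.sh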

Re: [slurm-users] [External] Re: exempting a node from Gres Autodetect

2021-03-04 Thread Paul Brunk
Hi all: Prentice wrote: > I don't see how that bug is related. That bug is about requiring the > libnvidia-ml.so library for an RPM that was built with NVML > Autodetect enabled. His problem is the opposite - he's already using > NVML autodetect, but wants to disable that feature on a single
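
For completeness, the usual shape of such an exemption looks something like the following (a sketch only; the node name and device path are hypothetical, and the per-node AutoDetect override is only honored on newer Slurm releases than the 20.02 discussed here):

    # /etc/gres.conf
    AutoDetect=nvml
    # Explicitly configured node that should not use NVML autodetection:
    NodeName=node042 AutoDetect=off Name=gpu File=/dev/nvidia0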

Re: [slurm-users] slurm and device cgroups

2021-03-04 Thread Ransom, Geoffrey M.
Well, reading the source it looks like xcgroup_set_params is just writing to the devices.allow and devices.deny files. I haven't yet found what cg->path is being set to, but presumably it is set to /sys/fs/cgroup/slurm/uid_##/job_#/step_0 or the equivalent for the job in question. I'm
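
Replicating that by hand under cgroup v1 would look roughly like the following (the cgroup path mirrors the one presumed above but under the devices controller mount, and the NVIDIA character-device major number 195 should be verified on the node):

    # Deny a job step access to /dev/nvidia1 (char device 195:1)
    echo "c 195:1 rwm" > /sys/fs/cgroup/devices/slurm/uid_1000/job_12345/step_0/devices.deny
    # Grant it back
    echo "c 195:1 rwm" > /sys/fs/cgroup/devices/slurm/uid_1000/job_12345/step_0/devices.allow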

[slurm-users] slurm and device cgroups

2021-03-04 Thread Ransom, Geoffrey M.
Hello, I am trying to debug an issue with EGL support (we updated the NVIDIA drivers, and now EGLGetDisplay and EGLQueryDevicesExt are failing if they can't access all /dev/nvidia# devices in slurm) and am wondering how slurm uses device cgroups so I can implement the same cgroup setup by hand and
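
For context, Slurm only applies device cgroups when they are enabled in cgroup.conf; the relevant lines look something like this (a sketch, not the poster's actual configuration):

    # /etc/slurm/cgroup.conf
    ConstrainDevices=yes
    # Devices listed in this file stay visible to every job; everything else
    # is limited to what the job's GRES allocation grants:
    AllowedDevicesFile=/etc/slurm/cgroup_allowed_devices_file.conf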

Re: [slurm-users] [External] Re: About sacct --format: how can I get info about the fields

2021-03-04 Thread Prentice Bisbal
They're also listed on the sacct online man page: https://slurm.schedmd.com/sacct.html Scroll down until you see the text box with white text on a black background - you can't miss it. Also, depending on how you're parsing the output, you might want to skip printing the headers, which
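
For example, to feed the output to a parser without the header block:

    # Pipe-delimited output, no header line
    sacct --noheader --parsable2 --format=JobID,JobName,State,Elapsed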

Re: [slurm-users] prolog not passing env var to job

2021-03-04 Thread Chin,David
Hi Brian: This works just as I expect for sbatch. The example srun execution I showed was a non-array job, so the first half of the "if []" statement holds. It is the second half, which deals with job arrays, that has the period. The value of TMP is correct, i.e. "/local/scratch/80472" And

Re: [slurm-users] prolog not passing env var to job

2021-03-04 Thread Chin,David
Hi, Brian: So, this is my SrunProlog script -- I want a job-specific tmp dir, which makes for easy cleanup at end of job:

    #!/bin/bash
    if [[ -z ${SLURM_ARRAY_JOB_ID+x} ]]
    then
        export TMP="/local/scratch/${SLURM_JOB_ID}"
        export TMPDIR="${TMP}"
        export LOCAL_TMPDIR="${TMP}"
        export
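
Filling in the branch discussed in the other reply in this thread, a minimal version of such a prolog might look like this (the array-job path with the period is inferred from that reply, not copied from the original script):

    #!/bin/bash
    # Per-job scratch directory; array tasks get jobid.taskid
    if [[ -z ${SLURM_ARRAY_JOB_ID+x} ]]; then
        export TMP="/local/scratch/${SLURM_JOB_ID}"
    else
        export TMP="/local/scratch/${SLURM_ARRAY_JOB_ID}.${SLURM_ARRAY_TASK_ID}"
    fi
    export TMPDIR="${TMP}"
    export LOCAL_TMPDIR="${TMP}"
    mkdir -p "${TMP}"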

Re: [slurm-users] prolog not passing env var to job

2021-03-04 Thread Brian Andrus
It seems to me that if you are using srun directly to get an interactive shell, you can just run the script once you get your shell. You can set the variables and then run srun. It automatically exports the environment. If you want to change a particular one (or more), use something like
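
Continuing that thought, overriding just one or two variables while keeping the rest of the environment is typically done with srun's --export option (the path below is a placeholder):

    # Keep the full environment, but override TMP for this interactive step
    srun --export=ALL,TMP=/local/scratch/mytest --pty bash -i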

[slurm-users] Set all Fairshares manually

2021-03-04 Thread Michael Müller
Dear Slurm users, we are running a cluster that has a flat account structure. All accounts have a monthly limit that can only change on the 1st of a month. Users assigned to the very same account shall not compete against each other (created with fairshare=parent) and their fairshare shall be
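
For reference, both explicit share values and the parent setting mentioned here are set through sacctmgr; the account and user names below are placeholders:

    # Give an account its own share value
    sacctmgr modify account where name=proj_a set fairshare=100
    # Let a user's jobs draw on the parent account's share rather than a personal one
    sacctmgr modify user where name=alice account=proj_a set fairshare=parent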