Re: [slurm-users] Trouble installing slurm-20.02.4-1.amzn2.x86_64 libnvidia-ml.so.1

2020-12-04 Thread Christopher Samuel
Hi Drew, On 12/4/20 11:32 am, Mullen, Drew wrote: Error: Package: slurm-20.02.4-1.amzn2.x86_64 (/slurm-20.02.4-1.amzn2.x86_64) Requires: libnvidia-ml.so.1()(64bit) That looks like it's fixed in 20.02.5 (the current release is 20.02.6):

[slurm-users] Trouble installing slurm-20.02.4-1.amzn2.x86_64 libnvidia-ml.so.1

2020-12-04 Thread Mullen, Drew
Howdy, I'm getting this error installing slurm 20.02.4: Error: Package: slurm-20.02.4-1.amzn2.x86_64 (/slurm-20.02.4-1.amzn2.x86_64) Requires: libnvidia-ml.so.1()(64bit) # ldconfig -p|grep libnvidia-ml.so.1 libnvidia-ml.so.1 (libc6,x86-64) => /lib64/libnvidia-ml.so.1
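The ldconfig output above shows the library is on disk, but yum resolves a Requires: line against capabilities recorded in the RPM database, not against the linker cache. A minimal sketch of checking that side, assuming rpm is on the PATH (the function name check_capability is made up for illustration and says nothing about what caused the error in this thread):

```shell
#!/bin/sh
# Capability string exactly as yum printed it in the error.
CAP='libnvidia-ml.so.1()(64bit)'

# Report whether any installed RPM declares the given capability.
# Prints one of three fixed messages so the result is easy to script against.
check_capability() {
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not available"
        return 2
    fi
    if rpm -q --whatprovides "$1" >/dev/null 2>&1; then
        echo "provided by an installed package"
    else
        echo "not provided by any installed package"
        return 1
    fi
}

check_capability "$CAP" || true
```

If the library was installed outside of RPM, the second message would be expected even though ldconfig finds the file.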

Re: [slurm-users] [External] Re: can't lengthen my jobs log

2020-12-04 Thread Prentice Bisbal
I know I'm very late to this thread, but were/are you using the --allusers flag to sacct? If not, sacct only returns results for the user running the command (not sure if this is the case for root - I never need to run sacct as root). This minor detail tripped me up a few days ago when I was
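For reference, a small sketch of a query that includes every user's jobs. The wrapper name sacct_all_users, the DRY_RUN switch, the start date and the chosen output fields are illustrative assumptions; --allusers, --starttime and --format are standard sacct options.

```shell
#!/bin/sh
# Build a sacct query that covers every user's jobs, not just the caller's.
# With DRY_RUN=1 the command line is printed instead of executed.
sacct_all_users() {
    start=$1    # start of the reporting window, e.g. 2020-12-01
    set -- sacct --allusers --starttime "$start" --format=JobID,User,State
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

DRY_RUN=1
sacct_all_users 2020-12-01
```

Without --allusers (or -a), the same query run as an ordinary user lists only that user's jobs.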

Re: [slurm-users] Novice Slurm Upgrade Questions

2020-12-04 Thread Paul Edmon
It won't figure it out automatically, no. You will need to ensure that the spec is installing to the same location as your vendor installed it if they didn't put it in the default location (/opt isn't the default). -Paul Edmon- On 12/4/2020 3:39 PM, Jason Simms wrote: Dear Ole, Thanks. I've
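One way to find out where an existing installation lives is to take the path of a Slurm binary and strip the last two components. This is only a sketch: the helper name install_prefix and the example path /opt/slurm/sbin/slurmd are hypothetical, and it assumes the usual prefix/sbin/binary layout.

```shell
#!/bin/sh
# Derive the install prefix from the full path of a Slurm binary,
# e.g. /opt/slurm/sbin/slurmd -> /opt/slurm.
install_prefix() {
    dirname "$(dirname "$1")"
}

# Hypothetical path; on a real node it could come from: command -v slurmd
install_prefix /opt/slurm/sbin/slurmd
```

The resulting prefix is the value the RPM build would need to match, for instance via an rpmbuild --define of the prefix macro that the spec uses (check the spec itself for the macro name).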

Re: [slurm-users] Novice Slurm Upgrade Questions

2020-12-04 Thread Jason Simms
Dear Ole, Thanks. I've read through your docs many times. The relevant upgrade section begins with the assumption that you have properly configured RPMs, so all I'm trying to do is ensure I get to that point. As I noted, a vendor installed Slurm initially through a proprietary script, though they

Re: [slurm-users] pmix issue

2020-12-04 Thread Andy Riebs
Also, Slurm was built with "/fs/local/pmix-3.2.1" -- does that translate well to "/share/local/pmix-3.2.1"? Andy On 12/4/2020 2:59 PM, Andy Riebs wrote: Are you sure that /share/local/pmix-3.2.1 exists on the compute nodes? On 12/4/2020 2:54 PM, Yuengling, Philip J. wrote: Hi everyone,
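Both questions can be checked with a few lines of shell run on each compute node. A minimal sketch, using the two paths mentioned in this thread; the function name check_dir is made up for illustration:

```shell
#!/bin/sh
# Print whether a directory exists on the host running this function.
check_dir() {
    if [ -d "$1" ]; then
        echo "$(uname -n): $1 present"
    else
        echo "$(uname -n): $1 MISSING"
    fi
}

# The build-time path and the path the job is expected to see.
for d in /fs/local/pmix-3.2.1 /share/local/pmix-3.2.1; do
    check_dir "$d"
done
```

Saved as a script on a shared filesystem (say check_pmix.sh, a hypothetical name), it could be launched with srun across a few nodes so that each node reports its own view of the two paths.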

Re: [slurm-users] pmix issue

2020-12-04 Thread Andy Riebs
Are you sure that /share/local/pmix-3.2.1 exists on the compute nodes? On 12/4/2020 2:54 PM, Yuengling, Philip J. wrote: Hi everyone, I’ve been having difficulty getting the --mpi=pmix_v3 option to work for me.  I can get --mpi=pmi2 to work ok, but I really want to understand what I’m doing

[slurm-users] pmix issue

2020-12-04 Thread Yuengling, Philip J.
Hi everyone, I’ve been having difficulty getting the --mpi=pmix_v3 option to work for me. I can get --mpi=pmi2 to work ok, but I really want to understand what I’m doing wrong here. Everything seems to build ok. $ srun --mpi=list srun: MPI types are... srun: pmix srun: pmix_v3 srun:

Re: [slurm-users] Novice Slurm Upgrade Questions

2020-12-04 Thread Paul Edmon
Usually the slurm.spec file provided doesn't change that much between versions.  What we do here is that we maintain a git repository of our slurm.spec that we use with our modifications. Then each time Slurm is released we compare ours against what is provided, and simply modify the provided
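The comparison step can be sketched with plain diff. The file names ours.spec and upstream.spec and their contents are stand-ins created in a temporary directory; in practice they would be the spec from the site's git repository and the one shipped in the new release tarball.

```shell
#!/bin/sh
# Show what differs between two spec files as a unified diff.
# Like diff itself, returns 0 when identical and 1 when they differ.
compare_specs() {
    diff -u "$1" "$2"
}

# Stand-in files so the sketch runs anywhere.
tmp=$(mktemp -d)
printf 'Name: slurm\n' > "$tmp/upstream.spec"
printf 'Name: slurm\n# local modification\n' > "$tmp/ours.spec"

compare_specs "$tmp/upstream.spec" "$tmp/ours.spec" || true
rm -rf "$tmp"
```

Lines prefixed with + in the output are the local modifications that would need to be carried over to the newly released spec.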

Re: [slurm-users] [External] Re: can't lengthen my jobs log

2020-12-04 Thread Ryan Novosielski
As root, -a is effectively applied to every command I’m aware of. -- #BlackLivesMatter | Ryan Novosielski - novos...@rutgers.edu | Sr. Technologist - 973/972.0922 (2x0922) | Rutgers, the State University

[slurm-users] Novice Slurm Upgrade Questions

2020-12-04 Thread Jason Simms
Hello all, Thank you for being such a helpful resource for All Things Slurm; I sincerely appreciate the helpful feedback. Right now, we are running 20.02 and considering upgrading to 20.11 during our next maintenance window in January. This will be the first time we have upgraded Slurm, so

Re: [slurm-users] [EXT] job_submit.lua - choice of error on failure / job_desc.gpus?

2020-12-04 Thread Sean Crosby
Hi Loris, This is our submit filter for what you're asking. It checks for both --gres and --gpus
ESLURM_INVALID_GRES=2072
ESLURM_BAD_TASK_COUNT=2025
if ( job_desc.partition ~= slurm.NO_VAL ) then
  if (job_desc.partition ~= nil) then
    if (string.match(job_desc.partition,"gpgpu") or

[slurm-users] job_submit.lua - choice of error on failure / job_desc.gpus?

2020-12-04 Thread Loris Bennett
Hi, I want to reject jobs that don't specify any GPUs when accessing our GPU partition and have the following in job_submit.lua:
if (job_desc.partition == "gpu" and job_desc.gres == nil ) then
  slurm.log_user(string.format("Please request GPU resources in the partition 'gpu', " ..