[slurm-users] Extremely sluggish squeue -p partition

2020-12-07 Thread Williams, Jenny Avis
I have an interesting condition that has been going on for a few days and could use the feedback of those more familiar with the way Slurm works under the hood. Conditions: Slurm v20.02.3. The cluster is relatively quiet given the time of year, and the commands are running on the host on which

Re: [slurm-users] [EXT] job_submit.lua - choice of error on failure / job_desc.gpus?

2020-12-07 Thread Sean Crosby
Hi Loris, We have a completely separate test system, complete with a few worker nodes, separate slurmctld/slurmdbd, so we can test Slurm upgrades etc. Sean -- Sean Crosby | Senior DevOps/HPC Engineer and HPC Team Lead Research Computing Services | Business Services The University of Melbourne,

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Philip Kovacs
Make sure the .so symlink for the pmix lib is available -- not just the versioned .so, e.g. .so.2.   Slurm requires that .so symlink.  Some distros split packages into base/devel, so you may need to install a pmix-devel package, if available, in order to add the .so symlink (which is
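The symlink chain Philip describes can be demonstrated in a scratch directory; the library name and version numbers below are illustrative, and on a real system the unversioned symlink would normally come from a pmix-devel package rather than a manual ln -s:

```shell
# Sketch of the shared-library naming convention involved (names invented):
tmp=$(mktemp -d)
touch "$tmp/libpmix.so.2.4.0"               # the real shared object
ln -s libpmix.so.2.4.0 "$tmp/libpmix.so.2"  # the runtime package ships this
ln -s libpmix.so.2 "$tmp/libpmix.so"        # a -devel package adds this one
ls -l "$tmp"
```

Slurm's build looks for the unversioned libpmix.so, which is why only the last link's absence causes the failure.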

Re: [slurm-users] submit_plugin.lua: distinguish between batch and interactive usage

2020-12-07 Thread Loris Bennett
Hi, Thanks for the example. Checking the value of job_desc.script seemed a little indirect to me, so I wondered if there were another way, but apparently not. Cheers, Loris Lech Nieroda writes: > Hello, > > It’s certainly possible to check whether the job is interactive or not, e.g. > > if

Re: [slurm-users] submit_plugin.lua: distinguish between batch and interactive usage

2020-12-07 Thread Lech Nieroda
Hello, It’s certainly possible to check whether the job is interactive or not, e.g. if job_desc.script == nil or job_desc.script == '' then slurm.log_info("slurm_job_submit: jobscript is missing, assuming interactive job") else slurm.log_info("slurm_job_submit: jobscript is present,
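Expanded from the fragment quoted above, the check might look like this in job_submit.lua. The log messages are illustrative, and the slurm table (log_info, SUCCESS, ...) is supplied by slurmctld, so this sketch only runs inside the controller:

```lua
-- job_submit.lua sketch: classify a job as interactive or batch by the
-- presence of a job script.
function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.script == nil or job_desc.script == '' then
        slurm.log_info("slurm_job_submit: job script missing, assuming interactive job")
    else
        slurm.log_info("slurm_job_submit: job script present, assuming batch job")
    end
    return slurm.SUCCESS
end
```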

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Andy Riebs
Hi Phil, From a distance, it feels like there may be a mismatch in Slurm versions (an auxiliary build hiding out somewhere?). You might try something like $ which srun; srun which srun just to confirm that both the submit and execute nodes are running the same Slurm instance. Andy On

[slurm-users] submit_plugin.lua: distinguish between batch and interactive usage

2020-12-07 Thread Loris Bennett
Hi, I would like to restrict interactive usage by, say, having a lower maximum run-time for interactive jobs. Is checking the value of job_desc.script the best way of determining whether a job has been submitted via sbatch or not? Cheers, Loris -- Dr. Loris Bennett (Hr./Mr.) ZEDAT,
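One sketch of the policy Loris describes: reuse the job_desc.script check and cap the time limit for interactive jobs. The 120-minute cap and log message are invented for illustration; time_limit is in minutes, and an unset limit appears as a large NO_VAL sentinel, which the cap also catches:

```lua
-- job_submit.lua sketch: lower maximum run-time for interactive jobs.
local INTERACTIVE_MAX_MIN = 120  -- illustrative policy choice, in minutes

function slurm_job_submit(job_desc, part_list, submit_uid)
    local interactive = (job_desc.script == nil or job_desc.script == '')
    if interactive and job_desc.time_limit > INTERACTIVE_MAX_MIN then
        job_desc.time_limit = INTERACTIVE_MAX_MIN
        slurm.log_info("slurm_job_submit: interactive job capped at %u minutes",
                       INTERACTIVE_MAX_MIN)
    end
    return slurm.SUCCESS
end
```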

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Yuengling, Philip J.
Thanks Andy, Slurm was compiled with --with-pmix=/share/local/pmix-3.2.1. The build of pmix is installed under /share/local/pmix-3.2.1 which is an NFS share across all the nodes. I should also note I used devtoolset-10 (gcc 10) on RHEL7 and confirmed that everything was compiled with that

Re: [slurm-users] [EXT] job_submit.lua - choice of error on failure / job_desc.gpus?

2020-12-07 Thread Loris Bennett
Hi Sean, Thanks for the code - looks like you have put a lot more thought into it than I have into mine. I'll certainly have to look at handling the 'tres-per-*' options. By the way, how do you do your testing? As I don't have a test cluster, currently I'm doing "open heart" testing, but I