Re: [slurm-users] slurm, gres:gpu, only 1 GPU out of 4 is detected

2019-11-13 Thread Chris Samuel
On Wednesday, 13 November 2019 10:11:30 AM PST Tamas Hegedus wrote:
> Thanks for your suggestion. You are right, I do not have to deal with specific GPUs.
> (I have not tried to compile your code, I simply tested two gromacs runs on the same node with --gres=gpu:1 options.)
How are you

Re: [slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Bas van der Vlies
On 11/13/19 8:36 PM, Christopher Samuel wrote: https://slurm.schedmd.com/quickstart_admin.html#upgrade As Ole says, *always* upgrade slurmdbd first, then slurmctld and finally slurmd's.  This is required because of the way the RPC protocol support for older versions works. Thanks

Re: [slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Bas van der Vlies
Hi Bas, Your order of upgrading is *disrecommended*, see for example page 6 in the presentation "Field Notes From A MadMan, Tim Wickberg, SchedMD" in the page https://slurm.schedmd.com/publications.html Versions may be mixed as follows: slurmdbd >= slurmctld >= slurmd >= commands Thanks
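
The mixing rule quoted above (slurmdbd >= slurmctld >= slurmd >= commands) can be checked with a few lines of Python. This is only a minimal sketch: the function names `parse_version` and `mix_is_allowed` are made up for illustration and are not part of Slurm, and the check covers only the ordering, not how many releases back each daemon can still talk to.

```python
def parse_version(text):
    """Turn '19.05.3' into (19, 5, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def mix_is_allowed(slurmdbd, slurmctld, slurmd, commands):
    """Return True if slurmdbd >= slurmctld >= slurmd >= commands.

    Hypothetical helper: it checks the ordering only and knows nothing
    about the release window that Slurm actually supports.
    """
    chain = [parse_version(v) for v in (slurmdbd, slurmctld, slurmd, commands)]
    # Each component must be at least as new as the one after it.
    return all(newer >= older for newer, older in zip(chain, chain[1:]))


# Upgrading slurmdbd first keeps the ordering intact.
print(mix_is_allowed("19.05.3", "18.08.7", "18.08.7", "18.08.7"))
# Upgrading slurmd on a compute node first breaks it.
print(mix_is_allowed("18.08.7", "18.08.7", "19.05.3", "18.08.7"))
```

The second call models the situation in the original question, where a compute node was upgraded to 19.05.3 while slurmctld was still on 18.08.7.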

Re: [slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Christopher Samuel
On 11/13/19 10:42 AM, Ole Holm Nielsen wrote: Your order of upgrading is *disrecommended*, see for example page 6 in the presentation "Field Notes From A MadMan, Tim Wickberg, SchedMD" in the page https://slurm.schedmd.com/publications.html Also the documentation for upgrading here:

Re: [slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Ole Holm Nielsen
On 13-11-2019 18:04, Bas van der Vlies wrote: We currently have version 18.08.7 installed on our cluster and want to upgrade to 19.05.3. So I wanted to start small and installed it on one of our compute nodes. But if I start the 'slurmd' then our slurmctld will complain that: {{{

Re: [slurm-users] slurm, gres:gpu, only 1 GPU out of 4 is detected

2019-11-13 Thread Tamas Hegedus
Thanks for your suggestion. You are right, I do not have to deal with specific GPUs. (I have not tried to compile your code, I simply tested two gromacs runs on the same node with --gres=gpu:1 options.) On 11/13/19 5:17 PM, Renfro, Michael wrote: Pretty sure you don’t need to explicitly

[slurm-users] Upgrade slurm to 19.05.3 from 18.08.7

2019-11-13 Thread Bas van der Vlies
We currently have version 18.08.7 installed on our cluster and want to upgrade to 19.05.3. So I wanted to start small and installed it on one of our compute nodes. But if I start the 'slurmd' then our slurmctld will complain that: {{{ [2019-11-13T17:49:37.402] error: slurm_unpack_received_msg:

Re: [slurm-users] slurm, gres:gpu, only 1 GPU out of 4 is detected

2019-11-13 Thread Renfro, Michael
Pretty sure you don’t need to explicitly specify GPU IDs on a Gromacs job running inside of Slurm with gres=gpu. Gromacs should only see the GPUs you have reserved for that job. Here’s some code you can run to verify that two different GPU jobs see different GPU devices (compile with
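
The verification code from that message is cut off in this listing. As a stand-in, here is a minimal Python sketch of the same idea, not the code from the original post: each job prints the GPU list it was handed via `CUDA_VISIBLE_DEVICES`. The helper name `visible_gpus` is hypothetical, and depending on how the cluster constrains devices, the printed indices may be renumbered per job rather than being physical GPU numbers.

```python
import os


def visible_gpus(environ=None):
    """Return the GPU indices listed in CUDA_VISIBLE_DEVICES as strings.

    Hypothetical helper for illustration; pass a dict to test it
    without touching the real environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    # An unset or empty variable yields an empty list.
    return [item for item in raw.split(",") if item]


if __name__ == "__main__":
    # SLURM_JOB_ID is set by Slurm inside a job; "unknown" is used outside one.
    job_id = os.environ.get("SLURM_JOB_ID", "unknown")
    print("job", job_id, "sees GPUs", visible_gpus())
```

Submitting this script twice with --gres=gpu:1 on the same node and comparing the two outputs gives a quick look at what each job was assigned.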

[slurm-users] slurm, gres:gpu, only 1 GPU out of 4 is detected

2019-11-13 Thread Tamas Hegedus
Hi, I run gmx 2019 using GPUs. There are 4 GPUs in my GPU hosts. I have Slurm and configured gres=gpu.
1. If I submit a job with --gres=gpu:1 then GPU#0 is identified and used (-gpu_id $CUDA_VISIBLE_DEVICES).
2. If I submit a second job, it fails: $CUDA_VISIBLE_DEVICES is 1 and selected, but