Re: [slurm-users] RES: RES: multiple srun commands in the same SLURM script

2023-11-01 Thread Kevin Broch
Could this apply in your case: https://slurm.schedmd.com/faq.html#opencl_pmix ? On Wed, Nov 1, 2023 at 5:24 AM Paulo Jose Braga Estrela < paulo.estr...@petrobras.com.br> wrote: > Yeah, you are right. I don’t know why but it seems that my email client > messed with message formatting putting all

[slurm-users] trying to configure preemption partitions and also non-preemption with OverSubscribe=FORCE

2023-06-14 Thread Kevin Broch
The general idea is to have priority batch partitions with preemptions that can occur for higher priority jobs (suspending the lower priority). Also there's an interactive partition where users can run GUI tools that can't be preempted. This works fine up to the point that I would like to
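The layout described above is commonly expressed in slurm.conf along these lines. This is a hypothetical sketch: the partition names, node list, and priority values are my own assumptions for illustration, not the poster's actual configuration.

```
# Assumes partition-priority-based preemption
PreemptType=preempt/partition_prio

# Low-priority batch partition: jobs here may be gang-suspended
PartitionName=batch_low   Nodes=node[01-04] PriorityTier=1  PreemptMode=SUSPEND OverSubscribe=FORCE:1 State=UP

# High-priority batch partition: its jobs preempt batch_low jobs
PartitionName=batch_high  Nodes=node[01-04] PriorityTier=10 PreemptMode=OFF State=UP

# Interactive partition: PreemptMode=OFF so GUI jobs are never suspended
PartitionName=interactive Nodes=node[01-04] PriorityTier=10 PreemptMode=OFF State=UP
```

With `preempt/partition_prio`, jobs in a higher `PriorityTier` partition can preempt jobs in a lower one; setting `PreemptMode=OFF` on a partition protects its jobs from being preempted.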

Re: [slurm-users] Partition Hold/Release

2023-03-15 Thread Kevin Broch
Nicolas, It looks like for the partition named "test" you still have PreemptMode=OFF? On Wed, Mar 15, 2023 at 7:35 AM Wagner, Marcus wrote: > Hi Nicolas, > > > sorry to say, but we have no experience with preemption. > > > Best > > Marcus > > > Am 14.03.2023 um 22:07 schrieb Nicolas Sonoda:

Re: [slurm-users] linting slurm.conf files

2023-01-27 Thread Kevin Broch
al syntax > checker (https://bugs.schedmd.com/show_bug.cgi?id=3435). > > -Paul Edmon- > On 1/27/23 2:36 PM, Kevin Broch wrote: > > I'm wondering what others use to lint their slurm.conf files to give more > confidence that the changes are valid. > > I came across https:/

[slurm-users] linting slurm.conf files

2023-01-27 Thread Kevin Broch
I'm wondering what others use to lint their slurm.conf files to give more confidence that the changes are valid. I came across https://github.com/appeltel/slurmlint which was somewhat functional but since it hasn't been updated since 2019, when I ran it against a valid slurm.conf file based on a
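Absent a maintained linter, a small home-grown sanity check can still catch the most common slip-ups before `slurmctld` sees the file. The sketch below is illustrative only, not a substitute for a real validator; the particular checks (Key=Value shape, duplicate scalar keys) and the set of repeatable keys are my own assumptions.

```python
import re

def lint_slurm_conf(text):
    """Very small slurm.conf sanity check: flags lines that are neither
    comments, blank, nor Key=Value pairs, and reports duplicated scalar keys."""
    problems = []
    seen = set()
    # Keys that legitimately repeat across lines (illustrative subset)
    repeatable = {"NODENAME", "PARTITIONNAME", "INCLUDE"}
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Every directive line should start with Key=Value
        m = re.match(r"([A-Za-z]+)\s*=\s*\S", stripped)
        if not m:
            problems.append((lineno, "not a Key=Value line"))
            continue
        key = m.group(1).upper()
        if key in seen and key not in repeatable:
            problems.append((lineno, f"duplicate key {m.group(1)}"))
        seen.add(key)
    return problems

conf = """\
ClusterName=demo
SlurmctldHost=head1
ClusterName=demo
BadLineWithoutEquals
"""
print(lint_slurm_conf(conf))
```

This only checks surface syntax; it knows nothing about which keys or values Slurm actually accepts, which is the part a real linter would need Slurm's own parser for.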

Re: [slurm-users] Cannot enable Gang scheduling

2023-01-13 Thread Kevin Broch
ry. > > I tried also: REQUEUE,GANG and CANCEL,GANG. > > None of these options seems to be able to preempt GPU jobs > > On Fri, 13 Jan 2023 at 12:30, Kevin Broch wrote: > >> My guess, is that this isn't possible with GANG,SUSPEND. GPU memory >> isn't managed in

Re: [slurm-users] Cannot enable Gang scheduling

2023-01-13 Thread Kevin Broch
> OverSubscribe=NO > OverTimeLimit=NONE PreemptMode=GANG,SUSPEND > State=UP TotalCPUs=64 TotalNodes=1 SelectTypeParameters=NONE > JobDefaults=DefCpuPerGPU=2 > DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED > > On Fri, 13 Jan 2023 at 11:16, Kevin Broch w

Re: [slurm-users] Cannot enable Gang scheduling

2023-01-13 Thread Kevin Broch
Problem might be that OverSubscribe is not enabled? w/o it, I don't believe the time-slicing can be GANG scheduled Can you do a "scontrol show partition" to verify that it is? On Thu, Jan 12, 2023 at 6:24 PM Helder Daniel wrote: > Hi, > > I am trying to enable gang scheduling on a server with
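The check suggested above can be automated by parsing `scontrol show partition` output, which is printed as space-separated Key=Value tokens with one `PartitionName=` token starting each record. A minimal sketch, with a fabricated sample output standing in for a real cluster:

```python
def parse_scontrol_partition(output):
    """Parse `scontrol show partition` output into {partition: {key: value}}."""
    partitions = {}
    current = None
    for token in output.split():
        if "=" not in token:
            continue
        key, _, value = token.partition("=")
        if key == "PartitionName":
            current = value
            partitions[current] = {}
        elif current is not None:
            partitions[current][key] = value
    return partitions

# Fabricated sample output, for illustration only
sample = """
PartitionName=gpu
   OverSubscribe=NO PreemptMode=GANG,SUSPEND
   State=UP TotalCPUs=64 TotalNodes=1
"""

parts = parse_scontrol_partition(sample)
for name, fields in parts.items():
    # Flag the combination questioned in this thread
    if fields.get("OverSubscribe") == "NO" and "GANG" in fields.get("PreemptMode", ""):
        print(f"{name}: GANG requested but OverSubscribe=NO")
```

Note this naive tokenizer would mishandle compound values such as `JobDefaults=DefCpuPerGPU=2`; it is meant only to show the shape of the check, not to be a complete scontrol parser.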