Hi,

I had a quick question about the behavior of the OverSubscribe setting on partitions. In my setup (17.02.1-2) I have a node that belongs to two partitions, and I am using select/cons_res with CR_Core_Memory.

With OverSubscribe=NO I can submit a job to the same node from both partitions, and both start executing immediately. However, when I try the same thing with OverSubscribe=YES, one of the jobs goes into the PD state until the other finishes. If I specify the -s flag, both jobs run concurrently.

According to [0], OverSubscribe=YES should behave the same as OverSubscribe=NO by default unless a flag is passed, but I think I’m seeing different behavior. Here are some outputs illustrating the issue:

[shadosub|04:33 PM]$ scontrol show partition | grep -e '^PartitionName' -e 'OverSubscribe'
PartitionName=test
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
PartitionName=test2
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO

[shadosub|04:35 PM]$ squeue
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
  385      test  test.sh sabobbin  R  0:01     1 shado00
  386     test2  test.sh sabobbin  R  0:01     1 shado00

[shadosub|04:35 PM]$ scontrol show partition | grep -e '^PartitionName' -e 'OverSubscribe'
PartitionName=test
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=YES:4
PartitionName=test2
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=YES:4

[shadosub|04:45 PM]$ squeue
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
  389     test2  test.sh sabobbin PD  0:00     1 (Resources)
  388      test  test.sh sabobbin  R  0:03     1 shado00

[shadosub|04:45 PM]$ squeue
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
  390      test  test.sh sabobbin  R  0:02     1 shado00
  391      test  test.sh sabobbin  R  0:02     1 shado00

As the last output shows, the jobs also execute concurrently if both are submitted to the same partition.
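For readers without the attachment, the setup described above corresponds roughly to the slurm.conf fragment below. This is only a sketch: the node name, partition names, and OverSubscribe value come from the outputs above, while the exact lines are an assumption and are not copied from the attached slurm.conf. Node definitions and all other settings are omitted.

```
# Sketch only; not copied from the attached slurm.conf.
# Consumable-resource selection, tracking cores and memory
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory

# The same node (shado00) is listed in both partitions.
# Replacing YES:4 with NO gives the first configuration shown above.
PartitionName=test  Nodes=shado00 OverSubscribe=YES:4
PartitionName=test2 Nodes=shado00 OverSubscribe=YES:4
```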
To me this seems like a bug, but I wanted to ping the group to see if I’m missing an option or misunderstanding the expected behavior.

Thanks,
Shawn
Attachments: slurm.conf, slurm.info