Hi

I am just building my first Slurm setup and have got everything running - well, 
almost.

I have a two-node configuration. The whole setup runs on a single Hyper-V 
server, whose resources I have divided up to create my VMs.

One node I will use for heavy duty work; this is called compute001
One node I will use for normal work; this is called compute002

My compute node specification in slurm.conf is:
NodeName=DEFAULT CPUs=1 RealMemory=1000 State=UNKNOWN
NodeName=compute001 CPUs=32
NodeName=compute002 CPUs=2

The partition specification is:
PartitionName=DEFAULT State=UP
PartitionName=interactive Nodes=compute002 MaxTime=INFINITE OverSubscribe=FORCE
PartitionName=simulation Nodes=compute001 MaxTime=30 OverSubscribe=FORCE
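
To make the effect of the DEFAULT lines explicit, here is a minimal sketch in Python that merges them into the node and partition definitions quoted above. It is only an illustration: the resolve() helper is hypothetical and is not Slurm's own parser (it ignores case-insensitive keys, node ranges and so on). It assumes only that a DEFAULT line applies to the definitions that follow it.

```python
# Sketch only: resolve NodeName=DEFAULT / PartitionName=DEFAULT inheritance
# for the lines quoted above. resolve() is a hypothetical helper, not part
# of Slurm, and it handles only simple "Key=Value" tokens.

CONF = """\
NodeName=DEFAULT CPUs=1 RealMemory=1000 State=UNKNOWN
NodeName=compute001 CPUs=32
NodeName=compute002 CPUs=2
PartitionName=DEFAULT State=UP
PartitionName=interactive Nodes=compute002 MaxTime=INFINITE OverSubscribe=FORCE
PartitionName=simulation Nodes=compute001 MaxTime=30 OverSubscribe=FORCE
"""


def resolve(conf_text):
    """Return the effective options per node and per partition."""
    defaults = {"NodeName": {}, "PartitionName": {}}
    resolved = {"NodeName": {}, "PartitionName": {}}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        pairs = [token.split("=", 1) for token in line.split()]
        kind, name = pairs[0]
        if kind not in defaults:
            continue  # not a node or partition line
        options = dict(pairs[1:])
        if name == "DEFAULT":
            # A DEFAULT line sets values for the definitions after it.
            defaults[kind].update(options)
        else:
            # Explicit options override the inherited defaults.
            resolved[kind][name] = {**defaults[kind], **options}
    return resolved


effective = resolve(CONF)
for kind, entries in effective.items():
    for name, options in entries.items():
        print(kind, name, options)
```

Under that assumption, both compute001 and compute002 end up with RealMemory=1000 and State=UNKNOWN, and both partitions end up with State=UP; only the CPU counts differ between the two nodes.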


I have added the OverSubscribe=FORCE option because I want more than one job 
to be able to run on each of my interactive/simulation partitions at once.

All of the nodes and the cluster master start up fine and talk to each other, 
but no matter what I do, I cannot get my cluster to accept more than one job 
per node.


Can you help me determine where I am going wrong?
Thanks a lot
Jake


The entire slurm.conf is pasted below:
# slurm.conf file generated by configurator.html.
ClusterName=pm-slurm
SlurmctldHost=slurm-master
MpiDefault=none
ProctrackType=proctrack/cgroup
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/home/slurm/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/cgroup
#
# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
#
# LOGGING AND ACCOUNTING
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/cgroup
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurmd.log

# COMPUTE NODES
NodeName=DEFAULT CPUs=1 RealMemory=1000 State=UNKNOWN
NodeName=compute001 CPUs=32
NodeName=compute002 CPUs=2

PartitionName=DEFAULT State=UP
PartitionName=interactive Nodes=compute002 MaxTime=INFINITE OverSubscribe=FORCE
PartitionName=simulation Nodes=compute001 MaxTime=30 OverSubscribe=FORCE

