Hi Gang

I recently set up a Slurm cluster for RStudio. A few days ago there was a post 
on here about using all cores for R, and my issue is similar, I guess.

At the moment every core gets allocated to a job (one each), but what I'd like 
to do is force oversubscription of the CPUs so each core can take on more than 
one submitted job at once. Is this possible? The reason is that we teach R 
classes, and our usage is short bursts of concurrent activity from around 
100-150 people.

I can see the jobs hitting the queue, but with 8 machines of 8 cores each the 
most I can push it to is 64 concurrent sessions, no matter how many I submit 
(something like the test burst below).
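(Rough sketch of what I mean by a burst; the sleep job is just a stand-in for 
a real RStudio/R session:)

  # fire off 100 one-CPU dummy jobs, then count how many actually run
  for i in $(seq 1 100); do
      sbatch --partition=rs --ntasks=1 --wrap="sleep 300"
  done
  squeue --partition=rs --states=RUNNING --noheader | wc -l   # tops out at 64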

I did try to use oversubscribe on the partition:

PartitionName=rs Nodes=rsslu[1-8] Default=YES DefaultTime=01:00:00 MaxTime=24:00:00 OverSubscribe=FORCE:8 Shared=yes State=UP

But it didn't seem to make any difference. Any ideas welcome; here's my 
slurm.conf for completeness. I feel like I'm missing a jigsaw piece here. (:
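
For what it's worth, this is roughly how I apply the change and check it took 
(from memory, so treat it as a sketch rather than exact commands):

  # re-read slurm.conf on the controller, then confirm the partition picked it up
  sudo scontrol reconfigure
  scontrol show partition rs | grep -o 'OverSubscribe=[^ ]*'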

Bw
John

ControlMachine=rsslu1
BackupController=rsslu2
#
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurm-llnl
SwitchType=switch/none
TaskPlugin=task/none
#
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_MEMORY
#
MinJobAge=86400
#
FastSchedule=1
SchedulerType=sched/backfill
#
AccountingStorageType=accounting_storage/none
ClusterName=cluster
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
NodeName=rsslu[1-8] CPUs=8 Boards=1 SocketsPerBoard=8 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=16017
PartitionName=rs Nodes=rsslu[1-8] Default=YES DefaultTime=01:00:00 MaxTime=24:00:00 OverSubscribe=FORCE:8 Shared=yes State=UP
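
If it helps to cross-check the NodeName line against the hardware, slurmd can 
print what it detects on each box:

  slurmd -C   # prints the NodeName/CPUs/RealMemory line in slurm.conf format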


-- 
j...@sdf.org
SDF Public Access UNIX System - http://sdf.org
