That only works if ALL the nodes have GPUs. We have 200+ nodes and 30 of them
have GPUs, so we had to create three partitions: standard, gpu, and
cpufromgpunode. People in the standard partition can’t use the CPUs on the GPU
nodes, and people who submit to cpufromgpunode can’t use the CPUs in the
standard partition. We would like a way to specify something like
MaxCPUsPerJobOnThisNode, so that the standard partition could use all 24 cores
on nodes without a GPU and fewer on nodes with a GPU, or a way to specify
something like ReserveCPUForGPU on the node. I assume this is difficult,
because people have asked for it and it hasn’t been implemented.

Carl

Carl Schmidtmann
Center for Integrated Research Computing
University of Rochester
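
For reference, a minimal slurm.conf sketch of the three-partition layout
described above (node names, counts, and the exact MaxCPUsPerNode values are
made up for illustration):

    # CPU-only nodes and GPU nodes (hypothetical names and counts)
    NodeName=cpu[001-170] CPUs=24
    NodeName=gpu[001-030] CPUs=24 Gres=gpu:tesla:4
    # "standard" sees only the CPU-only nodes, so it can never use the
    # spare cores on the GPU nodes.
    PartitionName=standard       Nodes=cpu[001-170] Default=YES
    # The two overlapping partitions split the 24 cores on each GPU node
    # between GPU work and CPU-only work.
    PartitionName=gpu            Nodes=gpu[001-030] MaxCPUsPerNode=4
    PartitionName=cpufromgpunode Nodes=gpu[001-030] MaxCPUsPerNode=20
    # MaxCPUsPerNode is a partition-wide limit, so "standard" cannot span
    # both node types and still use all 24 cores on the CPU-only nodes;
    # hence the request for a per-node limit above.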

> On Apr 7, 2015, at 4:51 AM, Aaron Knister <aaron.knis...@gmail.com> wrote:
> 
> Would MaxCPUsPerNode set at the partition level help?
> 
> Here's the snippet from the man page:
> 
> MaxCPUsPerNode
> Maximum number of CPUs on any node available to all jobs from this partition. 
> This can be especially useful to schedule GPUs. For example a node can be 
> associated with two Slurm partitions (e.g. "cpu" and "gpu") and the 
> partition/queue "cpu" could be limited to only a subset of the node's CPUs, 
> insuring that one or more CPUs would be available to jobs in the "gpu" 
> partition/queue.
> 
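
As a concrete illustration of the man page example above (node name, core
count, and GPU count are hypothetical):

    # One 24-core node with 4 GPUs, shared by two partitions
    NodeName=node01 CPUs=24 Gres=gpu:4
    # Jobs in "cpu" may use at most 20 of the node's cores...
    PartitionName=cpu Nodes=node01 MaxCPUsPerNode=20
    # ...which leaves at least 4 cores for jobs in "gpu".
    PartitionName=gpu Nodes=node01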
> On Apr 6, 2015, at 11:25 PM, Novosielski, Ryan <novos...@ca.rutgers.edu> 
> wrote:
> 
>> I imagine part of the reason is to keep people from running CPU jobs that
>> would take more than 20 cores on the GPU machine, since the other nodes do
>> not have GPUs. I'd be interested in knowing strategies here too.
>> 
>> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
>> || \\UTGERS      |---------------------*O*---------------------
>> ||_// Biomedical | Ryan Novosielski - Senior Technologist
>> || \\ and Health | novos...@rutgers.edu- 973/972.0922 (2x0922)
>> ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
>>     `'
>> 
>> On Apr 6, 2015, at 20:17, Ryan Cox <ryan_...@byu.edu> wrote:
>> 
>>> 
>>> Chris,
>>> 
>>> Just have GPU users request the number of CPU cores that they need and
>>> don't lie to Slurm about the number of cores.  If a GPU user needs 4
>>> cores and 4 GPUs, have them request that.  That leaves 20 cores for
>>> others to use.
>>> 
>>> Ryan
>>> 
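
As an illustration of this approach (the script name and exact options are
hypothetical), such a request could look like:

    # Ask for 4 cores and 4 GPUs on one node; on a 24-core node this
    # leaves 20 cores allocatable to other jobs.
    sbatch --nodes=1 --ntasks=4 --gres=gpu:4 gpu_job.sh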
>>> On 04/06/2015 03:43 PM, Christopher B Coffey wrote:
>>>> Hello,
>>>> 
>>>> I’m curious how you handle the allocation of GPUs and cores on GPU
>>>> systems in your cluster.  My new GPU system has 24 cores and 2 Tesla K80s
>>>> (4 GPUs total).  We allocate cores/memory with:
>>>> 
>>>> SelectType=select/cons_res
>>>> SelectTypeParameters=CR_Core_Memory
>>>> 
>>>> 
>>>> What I’m thinking of doing is lying to Slurm about the true core count and
>>>> specifying CPUs=20, along with Gres=gpu:tesla:4.  Is this a reasonable
>>>> solution to ensure there is a core reserved for each GPU in the
>>>> system?  My thought is to allocate the 20 cores on the system to non-GPU
>>>> work instead of leaving them idle.
>>>> 
>>>> Thanks!
>>>> 
>>>> Chris
>>>> 
>>>> 
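
A minimal sketch of the "lie about the core count" approach Chris describes,
assuming a hypothetical node name, memory size, and the usual NVIDIA device
paths (note that Ryan's reply above recommends against it):

    # slurm.conf: advertise only 20 of the 24 cores so that 4 stay
    # implicitly reserved for the GPUs
    NodeName=gpunode01 CPUs=20 RealMemory=128000 Gres=gpu:tesla:4
    # gres.conf on that node
    Name=gpu Type=tesla File=/dev/nvidia[0-3]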
> 
