Hi John, I added the types to slurm.conf and to the gres.conf files on the nodes again, and included a gres.conf on the controller node, without any success.
Slurm rejects jobs with "--gres=gpu:1" or "--gres=gpu:tesla:1".

slurm.conf:

NodeName=smurf01 NodeAddr=192.168.1.101 Feature="intel,fermi" Boards=1 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=2 Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
...

gres.conf on controller:

NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia0 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia1 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia2 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia3 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia4 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia5 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia6 Count=1
NodeName=smurf01 Name=gpu Type=tesla File=/dev/nvidia7 Count=1
NodeName=smurf01 Name=ram Count=48
NodeName=smurf01 Name=gram Count=6000
NodeName=smurf01 Name=scratch Count=1300
...

gres.conf on smurf01:

Name=gpu Type=tesla File=/dev/nvidia0 Count=1
Name=gpu Type=tesla File=/dev/nvidia1 Count=1
Name=gpu Type=tesla File=/dev/nvidia2 Count=1
Name=gpu Type=tesla File=/dev/nvidia3 Count=1
Name=gpu Type=tesla File=/dev/nvidia4 Count=1
Name=gpu Type=tesla File=/dev/nvidia5 Count=1
Name=gpu Type=tesla File=/dev/nvidia6 Count=1
Name=gpu Type=tesla File=/dev/nvidia7 Count=1
Name=ram Count=48
Name=gram Count=6000
Name=scratch Count=1300

Regards
Daniel

-----Original Message-----
From: John Desantis [mailto:[email protected]]
Sent: Wednesday, May 6, 2015 21:33
To: slurm-dev
Subject: [slurm-dev] Re: slurm-dev Re: Job allocation for GPU jobs doesn't work using gpu plugin (node configuration not available)

Daniel,

I hit send without completing my message:

# gres.conf
NodeName=blah Name=gpu Type=Tesla-T10 File=/dev/nvidia[0-1]

HTH.
John DeSantis

2015-05-06 15:30 GMT-04:00 John Desantis <[email protected]>:
> Daniel,
>
> You sparked an interest.
>
> I was able to get Gres Types working by:
>
> 1.) Ensuring that the type was defined in slurm.conf for the nodes in
> question;
> 2.) Ensuring that the global gres.conf respected the type.
>
> salloc -n 1 --gres=gpu:Tesla-T10:1
> salloc: Pending job allocation 532507
> salloc: job 532507 queued and waiting for resources
>
> # slurm.conf
> Nodename=blah CPUs=16 CoresPerSocket=4 Sockets=4 RealMemory=129055
> Feature=ib_ddr,ib_ofa,sse,sse2,sse3,tpa,cpu_xeon,xeon_E7330,gpu_T10,titan,mem_128G
> Gres=gpu:Tesla-T10:2 Weight=1000
>
> # gres.conf
>
>
> 2015-05-06 15:25 GMT-04:00 John Desantis <[email protected]>:
>>
>> Daniel,
>>
>> "I can handle that temporarily with node features instead but I'd
>> prefer utilizing the gpu types."
>>
>> Guilty of reading your response too quickly...
>>
>> John DeSantis
>>
>> 2015-05-06 15:22 GMT-04:00 John Desantis <[email protected]>:
>>> Daniel,
>>>
>>> Instead of defining the GPU type in our Gres configuration (global
>>> with hostnames, no count), we simply add a feature so that users can
>>> request a GPU (or GPUs) via Gres and the specific model via a
>>> constraint. This may help your situation, so that your users can
>>> request a specific GPU model:
>>>
>>> srun --gres=gpu:1 -C "gpu_k20"
>>>
>>> I didn't think of it at the time, but I remember running --gres=help
>>> when initially setting up GPUs to help rule out errors. I don't
>>> know if you ran that command or not, but it's worth a shot to verify
>>> that Gres types are being seen correctly on a node by the
>>> controller. I also wonder whether using a cluster-wide Gres definition
>>> (vs. only on the nodes in question) would make a difference.
>>>
>>> John DeSantis
>>>
>>>
>>> 2015-05-06 15:12 GMT-04:00 Daniel Weber <[email protected]>:
>>>>
>>>> Hi John,
>>>>
>>>> I already tried using "Count=1" for each line as well as "Count=8" for a
>>>> single configuration line.
>>>>
>>>> I "solved" (or rather circumvented) the problem by removing the "Type=..."
>>>> specifications from the "gres.conf" files and from the slurm.conf.
>>>>
>>>> The jobs now run successfully, but without the possibility to request a
>>>> certain GPU type.
>>>>
>>>> The generic resource examples on schedmd.com explicitly show the "Type"
>>>> specifications on GPUs, and I would really like to use them.
>>>> I can handle that temporarily with node features instead, but I'd prefer
>>>> utilizing the GPU types.
>>>>
>>>> Thank you for your help (and the hint in the right direction).
>>>>
>>>> Kind regards
>>>> Daniel
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: John Desantis [mailto:[email protected]]
>>>> Sent: Wednesday, May 6, 2015 18:16
>>>> To: slurm-dev
>>>> Subject: [slurm-dev] Re: slurm-dev Re: Job allocation for GPU jobs
>>>> doesn't work using gpu plugin (node configuration not available)
>>>>
>>>>
>>>> Daniel,
>>>>
>>>> What about a count? Try adding a Count=1 after each of your GPU lines.
>>>>
>>>> John DeSantis
>>>>
>>>> 2015-05-06 11:54 GMT-04:00 Daniel Weber <[email protected]>:
>>>>>
>>>>> The same "problem" occurs when using the gres type in the srun syntax
>>>>> (i.e. using --gres=gpu:tesla:1).
>>>>>
>>>>> Regards,
>>>>> Daniel
>>>>>
>>>>> -----Original Message-----
>>>>> From: John Desantis [mailto:[email protected]]
>>>>> Sent: Wednesday, May 6, 2015 17:39
>>>>> To: slurm-dev
>>>>> Subject: [slurm-dev] Re: Job allocation for GPU jobs doesn't work
>>>>> using gpu plugin (node configuration not available)
>>>>>
>>>>>
>>>>> Daniel,
>>>>>
>>>>> We don't specify types in our Gres configuration, simply the resource.
>>>>>
>>>>> What happens if you update your srun syntax to:
>>>>>
>>>>> srun -n1 --gres=gpu:tesla:1
>>>>>
>>>>> Does that dispatch the job?
>>>>>
>>>>> John DeSantis
>>>>>
>>>>> 2015-05-06 9:40 GMT-04:00 Daniel Weber <[email protected]>:
>>>>>> Hello,
>>>>>>
>>>>>> Currently I'm trying to set up Slurm on a GPU cluster with a
>>>>>> small number of nodes (where smurf0[1-7] are the node names),
>>>>>> using the gpu plugin to allocate jobs (requiring GPUs).
>>>>>>
>>>>>> Unfortunately, when trying to run a GPU job (any number of GPUs;
>>>>>> --gres=gpu:N), Slurm doesn't execute it, asserting unavailability
>>>>>> of the requested configuration.
>>>>>> I attached some logs and configuration files in order to
>>>>>> provide the information necessary to analyze this issue.
>>>>>>
>>>>>> Note: Cross-posted here: http://serverfault.com/questions/685258
>>>>>>
>>>>>> Example (using some test.sh which echoes $CUDA_VISIBLE_DEVICES):
>>>>>>
>>>>>> srun -n1 --gres=gpu:1 test.sh
>>>>>> --> srun: error: Unable to allocate resources: Requested
>>>>>>     node configuration is not available
>>>>>>
>>>>>> The slurmctld log for such calls shows:
>>>>>>
>>>>>> gres: gpu state for job X
>>>>>>   gres_cnt:1 node_cnt:1 type:(null)
>>>>>> _pick_best_nodes: job X never runnable
>>>>>> _slurm_rpc_allocate_resources: Requested node
>>>>>> configuration is not available
>>>>>>
>>>>>> Jobs with any other type of configured generic resource complete
>>>>>> successfully:
>>>>>>
>>>>>> srun -n1 --gres=gram:500 test.sh
>>>>>> --> CUDA_VISIBLE_DEVICES=NoDevFiles
>>>>>>
>>>>>> The nodes and gres configuration in slurm.conf (which is attached
>>>>>> as well) are like:
>>>>>>
>>>>>> GresTypes=gpu,ram,gram,scratch
>>>>>> ...
>>>>>> NodeName=smurf01 NodeAddr=192.168.1.101 Feature="intel,fermi" Boards=1
>>>>>> SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=2
>>>>>> Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
>>>>>> NodeName=smurf02 NodeAddr=192.168.1.102 Feature="intel,fermi" Boards=1
>>>>>> SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=1
>>>>>> Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
>>>>>>
>>>>>> The respective gres.conf files are:
>>>>>>
>>>>>> Name=gpu Count=8 Type=tesla File=/dev/nvidia[0-7]
>>>>>> Name=ram Count=48
>>>>>> Name=gram Count=6000
>>>>>> Name=scratch Count=1300
>>>>>>
>>>>>> The output of "scontrol show node" lists all the nodes with the
>>>>>> correct gres configuration, i.e.:
>>>>>>
>>>>>> NodeName=smurf01 Arch=x86_64 CoresPerSocket=6
>>>>>> CPUAlloc=0 CPUErr=0 CPUTot=24 CPULoad=0.01 Features=intel,fermi
>>>>>> Gres=gpu:tesla:8,ram:48,gram:no_consume:6000,scratch:1300
>>>>>> ...etc.
>>>>>>
>>>>>> As far as I can tell, the slurmd daemon on the nodes recognizes
>>>>>> the GPUs (and other generic resources) correctly.
>>>>>>
>>>>>> My slurmd.log on node smurf01 says:
>>>>>>
>>>>>> Gres Name=gpu Type=tesla Count=8 ID=7696487 File=/dev/nvidia[0-7]
>>>>>>
>>>>>> The log for slurmctld shows:
>>>>>>
>>>>>> gres/gpu: state for smurf01
>>>>>>   gres_cnt found:8 configured:8 avail:8 alloc:0
>>>>>>   gres_bit_alloc:
>>>>>>   gres_used:(null)
>>>>>>
>>>>>> I can't figure out why the controller node states that jobs using
>>>>>> --gres=gpu:N are "never runnable" and why "the requested node
>>>>>> configuration is not available".
>>>>>> Any help is appreciated.
>>>>>>
>>>>>> Kind regards,
>>>>>> Daniel Weber
>>>>>>
>>>>>> PS: If further information is required, don't hesitate to ask.
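
Summing up the thread, a minimal sketch of a typed-GRES setup consistent with John's working Tesla-T10 example would pair the Type string in slurm.conf with the exact same string in gres.conf on every node. The node name, device paths, and counts below are illustrative placeholders, not taken from either cluster:

```conf
# slurm.conf (fragment)
GresTypes=gpu
# The <type> in Gres=gpu:<type>:<count> must reappear in gres.conf.
NodeName=gpunode01 Gres=gpu:tesla:2

# gres.conf on gpunode01 (or a global gres.conf with NodeName= prefixes)
# Type must match the slurm.conf string exactly, including case.
Name=gpu Type=tesla File=/dev/nvidia[0-1]
```

With a matching pair like this, jobs should be able to request either --gres=gpu:1 or --gres=gpu:tesla:1, and running srun --gres=help (suggested earlier in the thread) shows which GRES names the installation accepts.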
