Maybe I described the problem unclearly...
Effectively, the problem occurs when the number of GPUs a job needs
is uneven and greater than the number of GPUs hosted by one node. Let me
clarify with some examples:
* Needed 4 GPUs: No problem, fits on one node.
* Needed 5 GPUs: A problem... The closest you can get is to request
2 nodes with 3 GPUs each, which leaves one GPU unused...
Are there possibilities to circumvent this problem?
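For concreteness, here is a sketch of what I would like to be able to
express. This assumes a Slurm release with heterogeneous-job support
(17.11 or later, which may not apply here); `./gpu_app` is a placeholder
for the actual GPU application:

```shell
#!/bin/bash
# Component 0: one node providing 4 GPUs
#SBATCH --nodes=1
#SBATCH --gres=gpu:4
#SBATCH hetjob
# Component 1: a second node providing the remaining 1 GPU
#SBATCH --nodes=1
#SBATCH --gres=gpu:1

# Launch the application across both heterogeneous-job components
srun --het-group=0,1 ./gpu_app
```

With a submission like this, the 5-GPU case would allocate exactly 4+1
GPUs instead of rounding up to 3 GPUs on each of two nodes.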
Best regards,
Geert
On 02/21/2017 09:51 AM, Geert Geurts wrote:
Hello List,
I'm trying to help clients schedule GPU jobs in a way that lets them
utilize their GPUs fully. By using their GPUs fully I mean that each
GPU is occupied by a GPU job, regardless of possible interference from
other jobs or inefficiency in inter-GPU communication.
So the client has a 3-node cluster, with 2 nodes containing 4x NVIDIA
P100 GPUs, and 1 node containing 4x NVIDIA K40 GPUs. My client wants
to be able to allocate ONLY the needed number of GPUs to his job. This
is possible as long as the job doesn't need more than the number of
GPUs in one node. If this client wants to allocate 5 GPUs, I'm not
able to allocate 4 GPUs on one node and 1 GPU on a second... Does
Slurm have a solution for this problem?
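To illustrate, the requests look roughly like this (a sketch; `job.sh`
is a placeholder for the client's batch script, and my understanding is
that the `--gres=gpu:N` count applies per node, not per job):

```shell
# Works: 4 GPUs, all on a single node
sbatch --nodes=1 --gres=gpu:4 job.sh

# 5 GPUs total cannot be expressed directly: --gres=gpu:N is per node,
# so the closest fit is 2 nodes with 3 GPUs each, allocating 6 GPUs
# and leaving one of them unused
sbatch --nodes=2 --gres=gpu:3 job.sh
```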
Best regards,
Geert