Hi,

On 05.12.2015 at 01:21, Rajil Saraswat wrote:

> Hello,
> 
> We have a Rocks cluster with two nodes, each with 1 GPU and 32 CPUs. I
> have defined a separate queue (gpu.q) containing these two hosts,
> compute-4-0 and compute-4-1.
> 
> Since we have only 1 GPU per node, I want the first job to run on
> compute-4-0 and the second job to run on compute-4-1. However, with
> my current configuration the second submitted job also tries to
> execute on compute-4-0. How can I force the job to execute on
> compute-4-1 if compute-4-0 is already running something, so that
> subsequently submitted jobs end up in 'qw'?

Do you mean a GPU job alongside a conventional job, or two GPU jobs at the
same time?


> My config is as follows:
> 
> #qconf -sq gpu.q
> qname gpu.q
> hostlist @gpuhosts
> load_thresholds np_load_avg=1.75
> processors UNDEFINED
> slots 1,[compute-4-0.local=32],[compute-4-1.local=32]
> complex_values NONE
> ----------------------------------------
> #qconf -sc
> gpu gpu INT <= YES YES 0 0
> ----------------------------------------
> #qconf -se compute-4-0.local
> hostname compute-4-0.local
> load_scaling NONE
> complex_values gpu=1

As gpu is set to 1 here, I wonder how a second GPU job can start on this
host. What is the output of:

$ qhost -F gpu

in this case?
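
To illustrate why this matters, here is a toy model (not Grid Engine's actual
scheduler, which also weighs load and other criteria) of how a consumable
such as gpu=1 is debited per host. The names Host and dispatch are
hypothetical and exist only for this sketch. It assumes a job is charged
against the consumable only when it requests it, for example with
`qsub -l gpu=1`:

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for an exec host with 'complex_values gpu=1'."""
    name: str
    gpu_total: int = 1
    gpu_used: int = 0

    def gpu_free(self) -> int:
        return self.gpu_total - self.gpu_used


def dispatch(hosts, gpu_request):
    """Place a job on the first host with enough free gpu.

    Returns the host name, or None if no host fits (the job would wait).
    """
    for host in hosts:
        if host.gpu_free() >= gpu_request:
            host.gpu_used += gpu_request  # debit the consumable
            return host.name
    return None


# Three jobs that each request gpu=1: one per host, the third has to wait.
hosts = [Host("compute-4-0.local"), Host("compute-4-1.local")]
placements = [dispatch(hosts, 1) for _ in range(3)]

# Two jobs that request no gpu: nothing is debited, so both fit on one host.
solo = [Host("compute-4-0.local")]
unrequested = [dispatch(solo, 0) for _ in range(2)]

print(placements)
print(unrequested)
```

In this model, jobs that do not request the resource are never limited by it,
which would match the behaviour described in the question if the jobs were
submitted without a gpu request.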

-- Reuti


> load_values arch=linux-x64,num_proc=32,mem_total=129091.453125M, \
> swap_total=999.996094M,virtual_total=130091.449219M, \
> load_avg=0.970000,load_short=0.990000, \
> load_medium=0.970000,load_long=0.910000, \
> mem_free=104392.453125M,swap_free=999.996094M, \
> virtual_free=105392.449219M,mem_used=24699.000000M, \
> swap_used=0.000000M,virtual_used=24699.000000M, \
> cpu=3.200000, \
> m_topology=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC, \
> m_topology_inuse=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC, \
> m_socket=2,m_core=32,np_load_avg=0.030312, \
> np_load_short=0.030937,np_load_medium=0.030312, \
> np_load_long=0.028438
> processors 32
> -----------------------------------------------
> #qconf -shgrp @gpuhosts
> group_name @gpuhosts
> hostlist compute-4-0.local compute-4-1.local
> 
> 
> Thanks,
> Rajil
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

