Hi,

On 05.12.2015 at 01:21, Rajil Saraswat wrote:
> Hello,
>
> We have a Rocks cluster with two nodes each with 1 GPU and 32 CPUS. I
> have defined a separate queue (gpu.q) containing these two hosts
> compute-4-0 and compute-4-1.
>
> Since we have only 1 gpu per node, i want the first job to be run on
> compute-4-0 and the second job to be run on compute-4-1. However, with
> my current configuration the second submitted job also tries to
> execute on compute-4-0. How can i force the job to execute on
> compute-4-1 if compute-4-0 is already running something, and
> subsequent submitted jobs should end up in 'qw'?

Do you mean a GPU job alongside a conventional job, or two GPU jobs at the same time?

> My config is as follows:
>
> #qconf -sq gpu.q
> qname gpu.q
> hostlist @gpuhosts
> load_thresholds np_load_avg=1.75
> processors UNDEFINED
> slots 1,[compute-4-0.local=32],[compute-4-1.local=32]
> complex_values NONE
> ----------------------------------------
> #qconf -sc
> gpu gpu INT <= YES YES 0 0
> ----------------------------------------
> #qconf -se compute-4-0.local
> hostname compute-4-0.local
> load_scaling NONE
> complex_values gpu=1

As gpu is set to one here, I wonder how a second GPU job can start on this host. What is the output of:

$ qhost -F gpu

in this case?
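
One thing worth checking, and this is only an assumption since the submit command is not shown above: with a default of 0 in `qconf -sc`, the gpu consumable is only debited for jobs that request it explicitly. A minimal sketch of such a submission, where `job.sh` is a placeholder script name and the command is printed rather than executed:

```shell
# Hypothetical job script name; gpu.q and the gpu complex come from the config above.
JOB_SCRIPT="job.sh"
# Request one unit of the gpu consumable so it is debited on the selected host.
SUBMIT_CMD="qsub -q gpu.q -l gpu=1 $JOB_SCRIPT"
# Dry run: print the command instead of submitting it.
echo "$SUBMIT_CMD"
```

If the jobs are submitted without `-l gpu=1`, the scheduler has no reason to treat compute-4-0 as full while free slots remain.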
-- Reuti

> load_values arch=linux-x64,num_proc=32,mem_total=129091.453125M, \
> swap_total=999.996094M,virtual_total=130091.449219M, \
> load_avg=0.970000,load_short=0.990000, \
> load_medium=0.970000,load_long=0.910000, \
> mem_free=104392.453125M,swap_free=999.996094M, \
> virtual_free=105392.449219M,mem_used=24699.000000M, \
> swap_used=0.000000M,virtual_used=24699.000000M, \
> cpu=3.200000, \
> m_topology=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC, \
> m_topology_inuse=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC, \
> m_socket=2,m_core=32,np_load_avg=0.030312, \
> np_load_short=0.030937,np_load_medium=0.030312, \
> np_load_long=0.028438
> processors 32
> -----------------------------------------------
> #qconf -shgrp @gpuhosts
> group_name @gpuhosts
> hostlist compute-4-0.local compute-4-1.local
>
>
> Thanks,
> Rajil
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
