Hi Reuti,

Yes, this issue is only with parallel jobs. Regular jobs are being distributed fine according to load.
Thanks

On Thu, Mar 20, 2014 at 12:27 PM, Reuti <[email protected]> wrote:
> Hi,
>
> On 20.03.2014 at 18:15, Karun K wrote:
>
> > Our scheduler is configured with a "least_used_host" policy depending on
> > load average, and for PE environments it's $pe_slots.
> >
> > Regular jobs are being allocated as expected, but PE jobs are being
> > filled up before the scheduler moves to the next available node.
> >
> > How can I configure PE jobs to be round-robin as well? I.e. all requested
> > slots of a PE job have to be on the same host, but jobs should be
> > distributed rather than filling up one host.
>
> Is this issue limited to parallel jobs? For $pe_slots it should work, and
> I see you already defined a "job_load_adjustments".
>
> -- Reuti
>
> > Included our GE configs below, version 2011.11p1.
> >
> > Thanks,
> > Karun
> >
> > job-ID prior   name  user state submit/start at     queue             slots ja-task-ID
> > ---------------------------------------------------------------------------------------
> > 124688 0.51929 STDIN kk   r     03/13/2014 23:07:57 [email protected]     2
> > 124689 0.51929 STDIN kk   r     03/13/2014 23:07:57 [email protected]     2
> > 124690 0.51929 STDIN kk   r     03/13/2014 23:07:57 [email protected]     2
> > 124691 0.51929 STDIN kk   r     03/13/2014 23:08:02 [email protected]     2
> > 124692 0.51929 STDIN kk   r     03/13/2014 23:08:02 [email protected]     2
> > 124694 0.50500 STDIN kk   r     03/13/2014 23:08:27 [email protected]     1
> > 124695 0.50500 STDIN kk   r     03/13/2014 23:08:27 [email protected]     1
> > 124696 0.50500 STDIN kk   r     03/13/2014 23:08:27 [email protected]     1
> > 124697 0.50500 STDIN kk   r     03/13/2014 23:08:27 [email protected]     1
> >
> > [root@cluster ~]# qconf -ssconf
> > algorithm                    default
> > schedule_interval            0:0:05
> > maxujobs                     0
> > queue_sort_method            load
> > job_load_adjustments         np_load_avg=3.0
> > load_adjustment_decay_time   0:7:30
> > load_formula                 np_load_avg
> > schedd_job_info              true
> > flush_submit_sec             0
> > flush_finish_sec             0
> > params                       none
> > reprioritize_interval        0:0:0
> > halftime                     168
> > usage_weight_list            cpu=1.000000,mem=0.000000,io=0.000000
> >
> > ----
> >
> > [root@cluster ~]# qconf -sp threaded
> > pe_name            threaded
> > slots              9999
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /bin/true
> > stop_proc_args     /bin/true
> > allocation_rule    $pe_slots
> > control_slaves     FALSE
> > job_is_first_task  TRUE
> > urgency_slots      min
> > accounting_summary FALSE
> >
> > All nodes have an identical complex configuration.
> >
> > [root@cluster ~]# qconf -se compute-4-3
> > hostname           compute-4-3.local
> > load_scaling       NONE
> > complex_values     slots=30,h_vmem=120G
> > load_values        arch=linux-x64,num_proc=30,mem_total=123136.023438M, \
> >                    -------------------truncated-----------------------
> > processors         30
> > user_lists         NONE
> > xuser_lists        NONE
> > projects           NONE
> > xprojects          NONE
> > usage_scaling      NONE
> > report_variables   NONE
> >
> >
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
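[Editor's note for readers hitting the same symptom: one approach often suggested for spreading $pe_slots jobs across hosts is to sort hosts by slot usage instead of raw load average. This is a hedged sketch, not a fix confirmed in this thread; the `load_formula slots` trick assumes `slots` is defined as a consumable on each exec host (as in the `qconf -se` output above) and should be checked against sched_conf(5) for your Grid Engine version.]

```shell
# Hedged sketch: prefer the host with the fewest slots in use, which
# gives round-robin-like placement for PE jobs instead of fill-up.
# Assumes complex_values slots=... is set on every exec host.

qconf -msconf
# ...then in the editor, change:
#   load_formula        np_load_avg
# to:
#   load_formula        slots
# and keep:
#   queue_sort_method   load
```

Since np_load_avg reacts slowly to newly started jobs even with job_load_adjustments, sorting by a consumable such as slots updates immediately at dispatch time, which is what makes the distribution even.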
