[gridengine users] cannot run in PE ... because it only offers 0 slots
hello all,

I am having trouble configuring a queue on SGE 6.2u5 (linux).
I have two amd64 machines, each with this topology: SCCSCC
so the total number of cores is 8.

first, i defined a host group:

# qconf -shgrp @qlong
group_name @qlong
hostlist charybde scylla

then a queue:

# qconf -sq long1
qname                 long1
hostlist              @qlong
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make
rerun                 FALSE
slots                 4
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

but when i try to submit a job, it fails with:

% qsub -w v ./script1.sh
Job 14431 cannot run in PE mpi_labo because it only offers 0 slots

the beginning of the script is:

...
#$ -q long1
#$ -pe mpi_labo 6

and the PE is defined by:

qconf -sp mpi_labo
pe_name               mpi_labo
slots                 8
user_lists            NONE
xuser_lists           NONE
start_proc_args       /bin/true
stop_proc_args        /bin/true
allocation_rule       $pe_slots
control_slaves        TRUE
job_is_first_task     FALSE
urgency_slots         min
accounting_summary    FALSE

If i submit with -pe mpi_labo 4, it works. What am i missing?

I also tried to increase the slots value of the queue:

qconf -mq long1
slots 8

but in this case the program runs its 8 threads on the same host, and that's not what i want.

thanks in advance for help,

gerard

___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
Re: [gridengine users] cannot run in PE ... because it only offers 0 slots
Check the value of pe_list in your queue configuration. The MPI PE you are trying to use is not listed in the pe_list parameter for the queue you are submitting to. The queue you show only has make as a supported PE.

-Chris
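The check Chris describes can be sketched in a few lines of Python. This is only an illustration, not part of SGE: `parse_qconf` and `queue_offers_pe` are hypothetical helper names, the sample text is a shortened copy of the queue as posted, and the parser assumes simple one-line "attribute value" pairs (it ignores backslash continuation lines and host-specific `[host=...]` overrides that real qconf output may contain).

```python
def parse_qconf(text):
    """Parse 'attribute value' lines, as printed by qconf -sq, into a dict.

    Assumption: one attribute per line, no continuation lines.
    """
    config = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            config[parts[0]] = parts[1].strip()
    return config


def queue_offers_pe(queue_text, pe_name):
    """Return True if pe_name appears in the queue's pe_list attribute."""
    pe_list = parse_qconf(queue_text).get("pe_list", "NONE")
    # Entries may be separated by spaces or commas.
    return pe_name in pe_list.replace(",", " ").split()


# Shortened copy of the queue configuration as it was posted.
QUEUE_AS_POSTED = """\
qname long1
hostlist @qlong
pe_list make
slots 4
"""

# The same queue with the PE added to pe_list.
QUEUE_WITH_PE = QUEUE_AS_POSTED.replace("pe_list make", "pe_list make mpi_labo")

print(queue_offers_pe(QUEUE_AS_POSTED, "mpi_labo"))
print(queue_offers_pe(QUEUE_WITH_PE, "mpi_labo"))
```

On a live system the text would come from the output of `qconf -sq long1` rather than from a string literal.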
Re: [gridengine users] cannot run in PE ... because it only offers 0 slots
sorry, a typo... yes, long1 contains:

pe_list               make mpi_labo

On 11/18/11 03:24 PM, Chris Dagdigian wrote:

> Check the value of pe_list in your queue configuration. The MPI PE you are trying to use is not listed in the pe_list parameter for the queue you are submitting to. The queue you show only has make as a supported PE.
>
> -Chris
Re: [gridengine users] cannot run in PE ... because it only offers 0 slots
Hi,

On 18.11.2011 at 15:21, Gerard Henry wrote:

> hello all,
> i got trouble to configure a queue on SGE 6.2u5 (linux)
> I have two machines amd64, with this topology: SCCSCC
> so the total of cores is 8.
>
> [snip]
>
> and the PE is defined by:
> qconf -sp mpi_labo
> pe_name               mpi_labo
> slots                 8
> user_lists            NONE
> xuser_lists           NONE
> start_proc_args       /bin/true
> stop_proc_args        /bin/true
> allocation_rule       $pe_slots

$pe_slots means that all slots must be allocated on one and the same machine. Your queue offers 4 slots per host, so a request for 6 slots cannot be satisfied on any single host, while a request for 4 can. You can try $round_robin or $fill_up (man sge_pe) to get slots from different machines.

But: using threads (like OpenMP) needs an SMP machine. So running 8 threads in SMP isn't possible in your configuration, as you have only 4 cores per machine. You will need, for example, an MPI library to compute across machines.

-- Reuti
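To make the difference between the three allocation rules concrete, here is a small Python sketch. It is a deliberately simplified model, not the real scheduler: `allocate` is a hypothetical function, it only looks at free slots per host in a fixed order, and it ignores load, queue sorting, the PE's own slot limit and everything else the scheduler actually weighs. The host names and the 4 slots per host come from the configuration in this thread.

```python
def allocate(rule, requested, free_slots):
    """Simplified model of a PE allocation rule.

    free_slots maps host name -> free slots, in the order hosts are tried.
    Returns {host: granted_slots}, or None if the request cannot be met.
    """
    if rule not in ("$pe_slots", "$fill_up", "$round_robin"):
        raise ValueError("unsupported rule in this sketch: %r" % rule)

    if rule == "$pe_slots":
        # All slots must come from one and the same host.
        for host, free in free_slots.items():
            if free >= requested:
                return {host: requested}
        return None

    if sum(free_slots.values()) < requested:
        return None

    granted = {host: 0 for host in free_slots}
    remaining = requested
    if rule == "$fill_up":
        # Exhaust one host before moving on to the next.
        for host, free in free_slots.items():
            take = min(free, remaining)
            granted[host] = take
            remaining -= take
    else:
        # $round_robin: hand out one slot per host per pass.
        while remaining > 0:
            for host, free in free_slots.items():
                if remaining > 0 and granted[host] < free:
                    granted[host] += 1
                    remaining -= 1
    return {host: n for host, n in granted.items() if n > 0}


hosts = {"charybde": 4, "scylla": 4}  # slots 4 per host, as in queue long1

for rule in ("$pe_slots", "$fill_up", "$round_robin"):
    print(rule, allocate(rule, 6, hosts))
```

In this model a request for 6 slots fails under $pe_slots because no single host has 6 free slots, which matches the "only offers 0 slots" message, while $fill_up and $round_robin spread the 6 slots over both hosts. Spreading slots only helps a program that can actually run across machines, for example through MPI.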
Re: [gridengine users] cannot run in PE ... because it only offers 0 slots
On 11/18/11 03:31 PM, Reuti wrote:

> $pe_slots means that all slots must be allocated on one and the same machine. You can try $round_robin or $fill_up (man sge_pe) to get slots from different machines.
>
> But: using threads (like OpenMP) needs an SMP machine. So running 8 threads in SMP isn't possible in your configuration, as you have only 4 per machine. You will need, for example, an MPI library to compute across machines.

exactly what i missed! many thanks!