[gridengine users] cannot run in PE ... because it only offers 0 slots

2011-11-18 Thread Gerard Henry

Hello all,

I am having trouble configuring a queue on SGE 6.2u5 (Linux).

I have two amd64 machines, each with this topology: SCCSCC, so the total
number of cores is 8.


First, I defined a host group:
# qconf -shgrp @qlong
group_name @qlong
hostlist charybde scylla

Then a queue:
# qconf -sq long1
qname                 long1
hostlist              @qlong
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make
rerun                 FALSE
slots                 4
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

But when I try to submit a job, it fails with:
% qsub -w v ./script1.sh
Job 14431 cannot run in PE mpi_labo because it only offers 0 slots

The beginning of the script is:
...
#$ -q long1
#$ -pe mpi_labo 6


And the PE is defined by:
qconf -sp mpi_labo
pe_name            mpi_labo
slots              8
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE


If I submit with -pe mpi_labo 4, it works. What am I missing?

I also tried increasing the slots value:
qconf -mq long1
slots 8
but in this case the program runs all 8 of its threads on the same host,
which is not what I want.


thanks in advance for help,

gerard


___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users


Re: [gridengine users] cannot run in PE ... because it only offers 0 slots

2011-11-18 Thread Chris Dagdigian


Check the value of pe_list in your queue configuration. The MPI PE you 
are trying to use is not listed in the pe_list parameter for the queue 
you are submitting to.  The queue you show only has make as a 
supported PE.
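
As a rough illustration of the check described above, the following Python sketch parses text in the form printed by qconf -sq and tests whether a PE name appears in pe_list. The helper names parse_qconf and queue_offers_pe are hypothetical and not part of Grid Engine, and the parser assumes the simple one-attribute-per-line layout shown in this thread (no continuation lines, no per-host overrides).

```python
def parse_qconf(text):
    """Turn 'key value...' lines (as printed by qconf -sq / -sp) into a dict."""
    config = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split the key from the rest of the line
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and an echoed '# qconf ...' prompt line
        key = parts[0]
        config[key] = parts[1].strip() if len(parts) > 1 else ""
    return config


def queue_offers_pe(queue_config, pe_name):
    """True if pe_name is listed in the queue's pe_list attribute."""
    pe_list = queue_config.get("pe_list", "NONE").split()
    return pe_name in pe_list


# Excerpt of the queue definition posted in this thread.
QUEUE_TEXT = """\
qname long1
hostlist @qlong
pe_list make
slots 4
"""

queue = parse_qconf(QUEUE_TEXT)
print(queue_offers_pe(queue, "make"))      # True
print(queue_offers_pe(queue, "mpi_labo"))  # False
```

On a live system the same text could be obtained by running qconf -sq long1 and passing its output to parse_qconf.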


-Chris


Re: [gridengine users] cannot run in PE ... because it only offers 0 slots

2011-11-18 Thread Gerard Henry

Sorry, a typo...
Yes, long1 contains:
pe_list               make mpi_labo


Re: [gridengine users] cannot run in PE ... because it only offers 0 slots

2011-11-18 Thread Reuti

Hi,

On 18.11.2011 at 15:21, Gerard Henry wrote:

 Hello all,

 I am having trouble configuring a queue on SGE 6.2u5 (Linux).

 I have two amd64 machines, each with this topology: SCCSCC, so the total
 number of cores is 8.
 [snip]

 And the PE is defined by:
 qconf -sp mpi_labo
 pe_name            mpi_labo
 slots              8
 user_lists         NONE
 xuser_lists        NONE
 start_proc_args    /bin/true
 stop_proc_args     /bin/true
 allocation_rule    $pe_slots

$pe_slots means that all slots must be allocated on one and the same machine.
You can try $round_robin or $fill_up (see man sge_pe) to get slots from
different machines.

But: using threads (as with OpenMP) needs an SMP machine. So running 8 threads
in SMP isn't possible in your configuration, as you have only 4 cores per
machine. To compute across machines you will need, for example, an MPI library.
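
To make the difference between these allocation rules concrete, here is a simplified Python model. It is not Grid Engine's actual scheduler: it ignores load, host ordering, and every other scheduling factor, and the allocate function and host table are hypothetical. It assumes each host offers 4 free slots, as in the long1 queue definition from the original post.

```python
def allocate(rule, requested, free_slots):
    """Return {host: slots} for a request, or None if it cannot be satisfied.

    free_slots maps host name -> free slots, in the order hosts are considered.
    """
    if rule == "$pe_slots":
        # All slots must come from one single host.
        for host, free in free_slots.items():
            if free >= requested:
                return {host: requested}
        return None

    if sum(free_slots.values()) < requested:
        return None  # not enough slots even when hosts are combined

    granted = {host: 0 for host in free_slots}
    remaining = requested
    if rule == "$fill_up":
        # Use up one host completely before moving to the next.
        for host, free in free_slots.items():
            take = min(free, remaining)
            granted[host] = take
            remaining -= take
    elif rule == "$round_robin":
        # Hand out one slot per host in turn until the request is met.
        while remaining > 0:
            for host, free in free_slots.items():
                if remaining > 0 and granted[host] < free:
                    granted[host] += 1
                    remaining -= 1
    else:
        raise ValueError(f"unsupported rule: {rule}")
    return {host: n for host, n in granted.items() if n > 0}


hosts = {"charybde": 4, "scylla": 4}
for rule in ("$pe_slots", "$fill_up", "$round_robin"):
    print(rule, allocate(rule, 6, hosts))
# $pe_slots None
# $fill_up {'charybde': 4, 'scylla': 2}
# $round_robin {'charybde': 3, 'scylla': 3}
```

In this model a request for 6 slots under $pe_slots finds no single host with enough free slots, which is consistent with the "only offers 0 slots" message, while a request for 4 fits on one host.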

-- Reuti


Re: [gridengine users] cannot run in PE ... because it only offers 0 slots

2011-11-18 Thread Gerard Henry

Exactly what I missed! Thank you very much!