Hello,
I have a strange effect, and I am not sure whether it is "only" a
misconfiguration or a bug.
First: I run Son of Grid Engine 8.1.9-1.el6.x86_64 (I installed the RHEL
RPM on an openSUSE 13.1 machine. This should not matter in this case,
and it is reported to run on openSUSE).
mpirun and mpiexec are from openmpi-1.10.3 (no other MPI was installed,
neither on the master nor on the slaves). The installation was built with:
./configure --prefix=`pwd`/build --disable-dlopen --disable-mca-dso \
    --with-orte --with-sge --with-x --enable-mpi-thread-multiple \
    --enable-orterun-prefix-by-default --enable-mpirun-prefix-by-default \
    --enable-orte-static-ports --enable-mpi-cxx --enable-mpi-cxx-seek \
    --enable-oshmem --enable-java --enable-mpi-java
make
make install
I attached the outputs of 'qhost', 'qconf -sq all.q', 'qconf -sconf' and
'qconf -sp orte' as text files.
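For reference, a minimal job script for this setup could look like the
following (only a sketch; the job name, the 20-slot request and my_app
are placeholders, and mpirun is called without an explicit hostfile so
that it picks up the allocation SGE grants through the PE):

#!/bin/bash
#$ -N mpi_test            # job name (placeholder)
#$ -pe orte 20            # request 20 slots from the 'orte' PE
#$ -cwd                   # run in the submission directory
#$ -j y                   # merge stdout and stderr
# with tight SGE integration mpirun reads $PE_HOSTFILE itself, so no
# -hostfile is given; $NSLOTS is set by SGE to the number of granted slots
mpirun -np $NSLOTS ./my_app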
Now my problem:
I asked for 20 cores, and if I run qstat -u '*' it shows that this job
is running on slave07 using 20 cores, but that is not true! If I run
qstat -f -u '*' I see that this job has only 3 cores on slave07, and
there are 17 cores on other nodes allocated to this job which are in fact
unused!
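To compare where the slots went with where processes really run, I use
roughly the following checks (a sketch; my_app is a placeholder for the
binary name):

# per-host MASTER/SLAVE slot assignment as the scheduler sees it
qstat -g t -u '*'
# inside a running job: the host/slot list SGE actually granted
cat $PE_HOSTFILE
# on a compute node: the processes that really belong to the job
ps -ef | grep my_app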
Another example:
My job took, say, 6 CPUs on slave07 and 14 on slave06, but nothing was
running on slave06, so resources are wasted on slave06 and an overload
of slave07 becomes highly likely (the numbers are made up).
If I ran many independent 1-CPU jobs this would not be an issue, but
imagine I now request 60 CPUs on slave07; that would seriously overload
the node in many cases.
Another example:
If I ask for, say, 50 CPUs, the job will start on one node, e.g.
slave01, but reserve only, say, 15 of its 64 CPUs, and reserve the rest
on many other nodes (which obviously sit there doing nothing).
This has the bad consequence of allocating many more CPUs than are
actually available once many jobs are running; imagine you have 10 jobs
like this one... some nodes will run maybe 3 of them even though they
only have 24 CPUs...
I hope I have made the issue clear.
I also see that `qstat` and `qstat -f` disagree. The latter is correct;
I checked the processes running on the nodes.
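What I can still test is what Open MPI itself thinks it was allocated,
roughly like this (a sketch; my_app is again a placeholder, and as far
as I know both display options exist in 1.10.x):

# check that the gridengine support was built into Open MPI
ompi_info | grep gridengine
# print the allocation and the process map as mpirun sees them
mpirun --display-allocation --display-map -np $NSLOTS ./my_app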
Has anybody encountered such a problem before? Does anybody have an
idea where to look or what to test?
With kind regards, ulrich
HOSTNAME   ARCH      NCPU  NSOC  NCOR  NTHR   LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-----------------------------------------------------------------------------------
global     -            -     -     -     -      -       -       -       -       -
slave01    lx-amd64    64     4    64    64   0.01  504.9G    2.3G   10.0G     0.0
slave02    lx-amd64    64     4    64    64  123.0  504.8G  171.7G   10.0G     0.0
slave03    lx-amd64    64     4    64    64   0.01  504.9G    2.0G   10.0G     0.0
slave04    lx-amd64    64     4    64    64   0.01  504.9G    2.0G   10.0G     0.0
slave05    lx-amd64    64     4    64    64   2.01  504.9G   40.2G   10.0G     0.0
slave06    lx-amd64    40     4    40    40   0.01  314.8G    1.1G   32.0G     0.0
slave07    lx-amd64    24     2    24    24  166.9  188.8G  188.0G   30.0G   28.7G
slave08    lx-amd64    24     2    24    24   0.01  188.8G  660.5M   30.0G    7.5M
#global:
execd_spool_dir /opt/sge/default/spool
mailer /bin/mail
xterm /usr/bin/xterm
load_sensor none
prolog none
epilog none
shell_start_mode posix_compliant
login_shells sh,bash,ksh,csh,tcsh
min_uid 100
min_gid 100
user_lists none
xuser_lists none
projects none
xprojects none
enforce_project false
enforce_user auto
load_report_time 00:00:40
max_unheard 00:05:00
reschedule_unknown 00:00:00
loglevel log_warning
administrator_mail [email protected]
set_token_cmd none
pag_cmd none
token_extend_time none
shepherd_cmd none
qmaster_params none
execd_params none
reporting_params accounting=true reporting=false \
flush_time=00:00:15 joblog=false sharelog=00:00:00
finished_jobs 100
gid_range 20000-20100
qlogin_command builtin
qlogin_daemon builtin
rlogin_command builtin
rlogin_daemon builtin
rsh_command builtin
rsh_daemon builtin
max_aj_instances 2000
max_aj_tasks 75000
max_u_jobs 0
max_jobs 0
max_advance_reservations 0
auto_user_oticket 0
auto_user_fshare 0
auto_user_default_project none
auto_user_delete_time 86400
delegated_file_staging false
reprioritize 0
jsv_url none
jsv_allowed_mod ac,h,i,e,o,j,M,N,p,w
pe_name orte
slots 408
user_lists NONE
xuser_lists NONE
start_proc_args NONE
stop_proc_args NONE
allocation_rule $round_robin
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary FALSE
qsort_args NONE
qname all.q
hostlist @allhosts
seq_no 0
load_thresholds np_load_avg=8.75
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors UNDEFINED
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list make smp mpi orte
rerun FALSE
slots 1,[slave01=64],[slave02=64],[slave03=64],[slave04=64], \
[slave05=64],[slave06=40],[slave07=24],[slave08=24]
tmpdir /tmp
shell /bin/sh
prolog NONE
epilog NONE
shell_start_mode posix_compliant
starter_method NONE
suspend_method NONE
resume_method NONE
terminate_method NONE
notify 00:00:60
owner_list NONE
user_lists NONE
xuser_lists NONE
subordinate_list NONE
complex_values NONE
projects NONE
xprojects NONE
calendar NONE
initial_state default
s_rt INFINITY
h_rt INFINITY
s_cpu INFINITY
h_cpu INFINITY
s_fsize INFINITY
h_fsize INFINITY
s_data INFINITY
h_data INFINITY
s_stack INFINITY
h_stack INFINITY
s_core INFINITY
h_core INFINITY
s_rss INFINITY
h_rss INFINITY
s_vmem INFINITY
h_vmem INFINITY