Re: [gridengine users] jobs allocate cores on node but do nothing

Reuti Fri, 12 Aug 2016 10:13:53 -0700

Hi,

> On 12.08.2016 at 18:48, Ulrich Hiller <hil...@mpia-hd.mpg.de> wrote:
> 
> Hello,
> 
> I see a strange effect, and I am not sure whether it is "only" a
> misconfiguration or a bug.
> 
> First: I run Son of Grid Engine 8.1.9-1.el6.x86_64 (I installed the RHEL
> RPM on an openSUSE 13.1 machine. This should not matter in this case,
> and it is reported to run on openSUSE).
> 
> mpirun and mpiexec are from openmpi-1.10.3 (no other MPI was installed,
> neither on the master nor on the slaves). The installation was made with:
> ./configure --prefix=`pwd`/build --disable-dlopen --disable-mca-dso
> --with-orte --with-sge --with-x --enable-mpi-thread-multiple
> --enable-orterun-prefix-by-default --enable-mpirun-prefix-by-default
> --enable-orte-static-ports --enable-mpi-cxx --enable-mpi-cxx-seek
> --enable-oshmem --enable-java --enable-mpi-java
> make
> make install
> 
> I attached the outputs of 'qconf -ap all.q' , 'qconf -sconf' and 'qconf
> -sp orte' as textfiles.
> 
> Now my problem:
> I asked for 20 cores, and if I run qstat -u '*' it shows that this job
> is running on slave07 using 20 cores, but that is not true! If I run qstat
> -f -u '*' I see that this job is only using 3 cores on slave07 and that
> 17 cores on other nodes are allocated to this job but are in fact
> unused!


qstat will list only the master node of the parallel job and the overall
number of slots. You can check the granted allocation with:

$ qstat -g t -u '*'

The other issue seems to be that your job is in fact using only one machine,
which means that it is essentially ignoring the granted slot allocation. While
the job is running, can you please execute on the master node of the parallel
job:

$ ps -e f

(f without a leading -) and post the relevant lines, i.e. those processes that
belong to sge_execd or that run as children of the init process in case they
jumped out of the process tree. A good start might be to execute something
like `mpiexec sleep 300` in the jobscript.
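
A minimal jobscript for such a test might look like the sketch below. It
assumes that the parallel environment is called `orte` (as in your `qconf -sp
orte` output) and that 20 slots are requested as in your example. The job name,
the `MPIEXEC` variable and the `show_allocation` helper are made-up names for
illustration only. `$PE_HOSTFILE` is set by the grid engine for parallel jobs,
and each of its lines starts with the host name followed by the slots granted
there.

```shell
#!/bin/sh
#$ -N mpi_sleep_test
#$ -cwd
#$ -pe orte 20

# Hypothetical helper: print "<host>: <slots> slots" for every line of a
# PE hostfile (first column = host name, second column = granted slots).
show_allocation() {
    awk '{ printf "%s: %s slots\n", $1, $2 }' "$1"
}

# Only act inside a parallel job, where the grid engine sets $PE_HOSTFILE.
if [ -n "${PE_HOSTFILE:-}" ]; then
    show_allocation "$PE_HOSTFILE"
    # No -np and no hostfile on purpose: an Open MPI built with SGE support
    # is expected to pick up the granted slots by itself.
    "${MPIEXEC:-mpiexec}" sleep 300
fi
```

While the job sleeps, `ps -e f` on each granted node should show `sleep`
processes below sge_execd if the slots are really being used.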

The next step could be a `mpihello.c` in which you put an almost endless loop
and which you compile with all optimizations switched off, to check whether the
slave processes are distributed in the correct way.
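
Even before compiling anything, a rough check of the distribution could be to
let every rank print its host name and count the lines per host. This is only
a sketch under the same assumptions as your setup (PE `orte`, 20 slots);
`count_ranks_per_host` and `MPIEXEC` are made-up names for illustration.

```shell
#!/bin/sh
#$ -cwd
#$ -pe orte 20

# Hypothetical helper: read one host name per line on stdin and print
# "<host>: <n> ranks" for each distinct host.
count_ranks_per_host() {
    sort | uniq -c | awk '{ print $2 ": " $1 " ranks" }'
}

# Only act inside a parallel job, where the grid engine sets $PE_HOSTFILE.
if [ -n "${PE_HOSTFILE:-}" ]; then
    # Every started rank runs `hostname` on the node it was placed on.
    "${MPIEXEC:-mpiexec}" hostname | count_ranks_per_host
fi
```

If the counts in the job's output file match what `qstat -g t` shows, the
slots are honored; if all ranks report the master node, they are not.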

Note that some applications check the number of cores they are running on and
start, via OpenMP (not Open MPI), as many threads as cores are found. Could
this be the case for your application too?
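
To rule this out, one could limit OpenMP to a single thread per process in the
jobscript, roughly as sketched here. `OMP_NUM_THREADS` is the standard OpenMP
variable and `-x` is the Open MPI option to export a variable to the started
ranks; `run_single_threaded`, `MPIEXEC` and `./my_app` are made-up names for
illustration.

```shell
# Limit OpenMP to one thread per process, so that any load seen on a node
# comes from the MPI ranks and not from implicitly started threads.
OMP_NUM_THREADS=1
export OMP_NUM_THREADS

# Hypothetical wrapper: start the given command via mpiexec and forward
# OMP_NUM_THREADS to the ranks on the other nodes as well.
run_single_threaded() {
    "${MPIEXEC:-mpiexec}" -x OMP_NUM_THREADS "$@"
}

# Usage inside the jobscript (./my_app is a placeholder):
#   run_single_threaded ./my_app
```

If the overload on the master node disappears with this setting, the extra
load came from threads and not from misplaced MPI processes.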

-- Reuti


> Another example:
> My job took, say, 6 CPUs on slave07 and 14 on slave06, but nothing was
> running on slave06, so resources are wasted on slave06 and an overload on
> slave07 becomes quite possible (the numbers are made up).
> If I ran many independent single-CPU jobs, that would not be an issue, but
> imagine I now request 60 CPUs on slave07: that would seriously overload
> the node in many cases.
> 
> Another example:
> If I ask for, say, 50 CPUs, the job will start on one node, e.g.
> slave01, but reserve only, say, 15 CPUs out of 64 there and reserve the rest
> on many other nodes (where they obviously sit idle).
> The bad consequence is that many more CPUs are allocated than are available
> when many jobs are running. Imagine you have 10 jobs like this one:
> some nodes will run maybe 3 of them even if they only have 24 CPUs.
> 
> I hope that I have made clear what the issue is.
> 
> I also see that `qstat` and `qstat -f` disagree. The
> latter is correct; I checked the processes running on the nodes.
> 
> 
> Has somebody already encountered such a problem? Does somebody have an
> idea where to look or what to test?
> 
> With kind regards, ulrich
> 
> 
> 
> <qhost.txt><qconf-sconf.txt><qconf-mp-orte.txt><qconf-all.q>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users

