So I want to thank you so much! My benchmark for my actual application
went from 5052 seconds to 266 seconds with this simple fix!
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Wed, Mar 23, 2016 at 11:00 AM, Ronald Cohen wrote:
> Dear Gilles,
>
> --with-tm fails. I have now built with ./configure --prefix=/home/rcohen --with-tm=/opt/torque
I don't have any parameters set other than the defaults. Thank you!
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
On Wed, Mar 23, 2016 at 11:07 AM, Edgar Gabriel wrote:
> not sure whether it is relevant in this case, but I spent in January nearly
> one week to
Not sure whether it is relevant in this case, but in January I spent
nearly a week figuring out why the openib component was running very
slowly with the new Open MPI releases (though it was the 2.x series at
that time), and the culprit turned out to be the
btl_openib_flags parameter. I used to
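For anyone hitting the same thing: MCA parameters like btl_openib_flags
can be inspected and overridden without rebuilding. A minimal sketch,
assuming Open MPI's standard tools (the value 305 below is purely
illustrative, not a recommendation):
  ompi_info --param btl openib --level 9 | grep btl_openib_flags
  mpirun --mca btl_openib_flags 305 -hostfile $PBS_NODEFILE -n 16 xhpl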
Ronald,
out of curiosity, what kind of performance do you get with tcp and two
nodes?
e.g.
mpirun --mca btl tcp,vader,self ...
before that, you can run
mpirun uptime
to ensure all your nodes are free
(e.g. no process was left running by another job)
you might also want to allocate your nodes exclusively
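A sketch of that whole check, assuming Torque since $PBS_NODEFILE is
used elsewhere in this thread (node and rank counts are just the ones
from this cluster):
  mpirun -hostfile $PBS_NODEFILE uptime
  mpirun --mca btl tcp,vader,self -hostfile $PBS_NODEFILE -n 16 xhpl > xhpl.tcp.out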
Dear Gilles,
--with-tm fails. I have now built with
./configure --prefix=/home/rcohen --with-tm=/opt/torque
make clean
make -j 8
make install
This rebuild greatly improved performance, from 1 GF to 32 GF on 2
nodes for a matrix of size 2000. For size 5000 it went up to 108 GF.
So this sounds pretty good.
Ronald,
first, can you make sure tm was built?
the easiest way is to
configure --with-tm ...
it will fail if tm is not found
if pbs/torque is not installed in a standard location, then you have to
configure --with-tm=<path to your torque install>
then you can omit -hostfile from your mpirun command line
hpl is known to sc
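A quick non-root way to verify tm support was actually built, assuming
the ompi_info from the new install is first in your PATH:
  ompi_info | grep tm
If Torque support was compiled in, this should list tm components such
as the plm and ras ones.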
The configure line was simply:
./configure --prefix=/home/rcohen
when I run:
mpirun --mca btl self,vader,openib ...
I get the same lousy results: 1.5 GFLOPS
The output of the grep is:
Cpus_allowed_list: 0-7
Cpus_allowed_list: 8-15
Cpus_allowed_list: 0-7
Cpus_allowed_list:
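Output like the above typically comes from running grep on
/proc/self/status under mpirun; the exact command below is an
assumption, not quoted from this thread:
  mpirun -hostfile $PBS_NODEFILE -n 4 grep Cpus_allowed_list /proc/self/status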
I have tried:
mpirun --mca btl openib,self -hostfile $PBS_NODEFILE -n 16 xhpl > xhpl.out
and
mpirun -hostfile $PBS_NODEFILE -n 16 xhpl > xhpl.out
How do I run "sanity checks, like OSU latency and bandwidth benchmarks
between the nodes"? I am not a superuser. Thanks,
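For reference, the OSU micro-benchmarks need no root privileges; they
build and run entirely as a regular user. A sketch, assuming the
tarball layout of the 5.x series (check mvapich.cse.ohio-state.edu for
the current version and URL):
  wget http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.3.tar.gz
  tar xzf osu-micro-benchmarks-5.3.tar.gz
  cd osu-micro-benchmarks-5.3
  ./configure CC=mpicc CXX=mpicxx && make
  # one rank per node, so the traffic actually crosses the fabric
  mpirun -hostfile $PBS_NODEFILE -np 2 -npernode 1 mpi/pt2pt/osu_latency
  mpirun -hostfile $PBS_NODEFILE -np 2 -npernode 1 mpi/pt2pt/osu_bw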
Ron
---
Ron Cohen
recoh...@gmail.com
Ronald,
the fix I mentioned landed in the v1.10 branch
https://github.com/open-mpi/ompi-release/commit/c376994b81030cfa380c29d5b8f60c3e53d3df62
can you please post your configure command line ?
you can also try to
mpirun --mca btl self,vader,openib ...
to make sure your run will abort instead of silently falling back to
tcp if openib cannot be used
Hi, Ron
Please include the command line you used in your tests. Have you run any
sanity checks, like OSU latency and bandwidth benchmarks between the nodes?
Josh
On Wed, Mar 23, 2016 at 8:47 AM, Ronald Cohen wrote:
> Thank you! Here are the answers:
>
> I did not try a previous release of gcc
Thank you! Here are the answers:
I did not try a previous release of gcc.
I built from a tarball.
What should I do about the issue you mentioned? How should I check for it?
Are there any flags I should be using for InfiniBand? Is this a
latency problem?
Ron
---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
Ronald,
did you try to build openmpi with a previous gcc release?
if yes, what about the performance?
did you build openmpi from a tarball or from git?
if from git and without VPATH, then you need to
configure with --disable-debug
iirc, one issue was identified previously
(gcc optimization th
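For the git case, a sketch of a VPATH build with debug disabled (the
clone step and install prefix are illustrative):
  git clone https://github.com/open-mpi/ompi.git && cd ompi
  ./autogen.pl
  mkdir build && cd build
  ../configure --prefix=$HOME/openmpi --disable-debug
  make -j 8 && make install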
Attached is the output of ompi_info --all.
Note the message:
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the gfortran compiler, does not support the following:
array subsections, direct passthru (where possible) to underlying Open
MPI's C functionality
I get 100 GFLOPS for 16 cores on one node, but 1 GFLOPS running 8 cores
on each of two nodes. It seems that QDR InfiniBand should do better than
this. I built openmpi-1.10.2g with gcc version 6.0.0 20160317. Any
ideas of what to do to get usable performance? Thank you!
ibstatus
Infiniband device 'mlx4_0'