Hi Nick, 

Thanks for the feedback. Responses below: 

> On 06 Jul 2016, at 15:30, Nick Papior <[email protected]> wrote:
> 
> Hi Arthur,
> 
> 2016-07-06 15:13 GMT+02:00 Arthur France-Lanord <[email protected]>:
> Dear all,
> 
> I’ve recently compiled a parallel version of siesta 4.0b-485 using openmpi 
> 1.10.3, and I’m seeing horrible parallel performance. I was previously 
> building siesta with openmpi 1.6.x (on another box though) and clearly 
> didn’t get the same behaviour. For instance, when running Tests/si001, here 
> are the timings I get when using mpirun -np 1 (serial):
> 
> Start of run             0.000
> -------------- end of scf step             1.384
> -------------- end of scf step             1.589
> -------------- end of scf step             1.782
> -------------- end of scf step             1.987
> -------------- end of scf step             2.191
> etc
> 
> And when using mpirun -np 2 (parallel):
> 
> Start of run             0.000
> -------------- end of scf step            17.349
> -------------- end of scf step            32.758
> -------------- end of scf step            48.386
> -------------- end of scf step            64.036
> -------------- end of scf step            80.265
> etc
> 
> I’ve used openblas 0.2.18, lapack 3.4.2, openmpi 1.10.3, and scalapack 2.0.2, 
> all compiled manually from source. I’m also using a manually compiled gcc 
> 4.8.3. I ran some of the scalapack tests, the ones included under 
> scalapack-2.0.2/BLACS/TESTING and scalapack-2.0.2/TESTING, and things 
> apparently went well (even if it’s hard to get a feel for the timing).
> 
> Here are chunks of my arch.make which could be relevant:
> 
> FC=/home/afl/local/opt/openmpi/1.10.3/bin/mpif90
> FFLAGS=-O2
> FPPFLAGS= -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
> 
> BLAS_LIBS=/home/afl/local/opt/openblas/lib/libopenblas.so
> LAPACK_LIBS=/home/afl/local/opt/lapack/lib/liblapack.so
> Change these to 
> BLAS_PATH=/home/afl/local/opt/openblas/lib/
> BLAS_LIBS=-L$(BLAS_PATH) -Wl,-rpath=$(BLAS_PATH) -lopenblas
> LAPACK_PATH=/home/afl/local/opt/lapack/lib/
> LAPACK_LIBS=-L$(LAPACK_PATH) -Wl,-rpath=$(LAPACK_PATH) -llapack
> 
> Note, -Wl,-rpath only influences shared libraries, so it may not be needed for 
> lapack.

Thanks, I’ll try that. 

> 
> On a side note, please update to LAPACK 3.6.1. Also, LAPACK defaults to 
> compiling the static library (.a); are you sure you have the shared variant?

Yes, I’ve manually compiled shared libraries. Starting from the static variant, 
if I recall correctly I did: 

ar -x liblapack.a
/home/afl/local/opt/gcc/4.8.3/bin/gfortran -o liblapack.so *.o -lopenblas -shared -lpthread

> 
> If you do not force the libraries to be statically linked, then you need to 
> update LD_LIBRARY_PATH (which it seems you have not done).
> 
> 
> BLACS_LIBS=
> SCALAPACK_LIBS=-L/home/afl/local/opt/scalapack/lib -lscalapack 
> Change this to 
> SCALAPACK_PATH=/home/afl/local/opt/scalapack/lib
> SCALAPACK_LIBS=-L$(SCALAPACK_PATH) -Wl,-rpath=$(SCALAPACK_PATH) -lscalapack
> 
> Again, scalapack is most likely static (.a), so the -Wl,-rpath is most 
> likely superfluous (if you use scalapack as a static library).

I’ve also manually compiled a shared variant of the scalapack lib - I can try 
with both. 

> 
> LIBS=$(SCALAPACK_LIBS) $(LAPACK_LIBS) $(BLAS_LIBS)
> 
> MPI_INTERFACE=libmpi_f90.a
> MPI_INCLUDE=/home/afl/local/opt/openmpi/1.10.3/include/
> DEFS_MPI = -DMPI
> 
> Any ideas? Experience on building scalapack with a recent version of openmpi? 
> Am I obviously doing something wrong?
> You are linking libraries with fixed paths. This is very easy to get wrong. 
> At the very least you need to tell us explicitly how your environment is set 
> up (i.e. PATH, LD_LIBRARY_PATH).

Yes, I’d rather not do that either, but this is a somewhat special (not to say 
cumbersome) setup. I’m using CentOS 6.5 with Rocks, with gcc 4.4.7 and openmpi 
1.6.2 installed at the root level. Unfortunately, siesta’s compilation fails 
with this version of gcc (gfortran actually), apparently because of something 
related to gcc bug 40996 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40996). 
Since I don’t want to change anything at the root level on the machine I’m 
using, I’ve compiled gcc 4.8.3 and openmpi 1.10.3 (wrapping gcc 4.8.3) at the 
user level, without adding the resulting binaries to $PATH in order to avoid 
confusion. However, my $LD_LIBRARY_PATH currently contains two directories with 
different scalapack libraries, as well as two different library directories 
for openmpi. I’ll clean that up.
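
As a starting point for that cleanup, here is a minimal sketch of a shell 
helper that lists every directory on a search path that holds a shared library 
of a given name. The function name find_lib_dirs and the lib<name>.so* naming 
pattern are assumptions made for illustration. It only inspects the path it is 
given (LD_LIBRARY_PATH by default) and knows nothing about rpath entries or 
the system loader cache.

```shell
# List each directory of a colon-separated search path that contains a file
# matching lib<name>.so* (hypothetical helper, not part of siesta or openmpi).
#   $1: library name without the "lib" prefix, e.g. scalapack
#   $2: search path to inspect (defaults to LD_LIBRARY_PATH)
find_lib_dirs() {
    name=$1
    search_path=${2:-${LD_LIBRARY_PATH:-}}
    # Split the path on ':' and visit the directories in order.
    printf '%s\n' "$search_path" | tr ':' '\n' | while IFS= read -r dir; do
        # Empty path entries are skipped in this sketch.
        [ -n "$dir" ] || continue
        for f in "$dir"/lib"$name".so*; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}
```

Running find_lib_dirs scalapack and then find_lib_dirs mpi should print more 
than one directory each in the situation described above, and a single 
directory each once the path has been cleaned up.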


> Also, you explicitly set the mpif90 compiler (full path), but you seem to run 
> using the generic mpirun. Do you have any other mpirun in your environment?
> 

Actually, I’m using the full path to the mpirun binary (I just wrote mpirun to 
simplify). 
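
For completeness, a small sketch of how the compiler wrapper and the launcher 
could be checked against each other. The helper name same_bin_dir is 
hypothetical, and it assumes that both binaries of one openmpi installation 
sit in the same bin directory.

```shell
# Succeed only if both paths point into the same directory
# (hypothetical helper; compares directory names as plain strings).
same_bin_dir() {
    [ "$(dirname -- "$1")" = "$(dirname -- "$2")" ]
}
```

It would be called with the two full paths, for example the FC value from 
arch.make and the mpirun used to launch the job, and returns a non-zero status 
when they come from different directories.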

> To verify that you have linked correctly, use ldd <exec> to check which 
> libraries are actually used.

Right, I forgot to do that. Here is what I’m getting: 

        linux-vdso.so.1 =>  (0x00007fff87d87000)
        /home/afl/local/opt/lapack/lib/liblapack.so (0x00002ad2b49bf000)
        libopenblas.so.0 => /home/afl/local/opt/Openblas/lib/libopenblas.so.0 (0x00002ad2b53a4000)
        libmpi_usempi.so.5 => /home/afl/local/opt/openmpi/1.10.0/lib/libmpi_usempi.so.5 (0x00002ad2b593c000)
        libmpi_mpifh.so.12 => /home/afl/local/opt/openmpi/1.10.0/lib/libmpi_mpifh.so.12 (0x00002ad2b5b3f000)
        libmpi.so.12 => /home/afl/local/opt/openmpi/1.10.0/lib/libmpi.so.12 (0x00002ad2b5d91000)
        libgfortran.so.3 => /home/afl/local/opt/gcc/4.8.3/lib64/libgfortran.so.3 (0x00002ad2b629c000)
        libm.so.6 => /lib64/libm.so.6 (0x000000327ba00000)
        libgcc_s.so.1 => /home/afl/local/opt/gcc/4.8.3/lib64/libgcc_s.so.1 (0x00002ad2b65c8000)
        libquadmath.so.0 => /home/afl/local/opt/gcc/4.8.3/lib64/libquadmath.so.0 (0x00002ad2b67de000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x000000327c200000)
        libc.so.6 => /lib64/libc.so.6 (0x000000327b600000)
        libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x000000327c600000)
        libopen-rte.so.12 => /home/afl/local/opt/openmpi/1.10.0/lib/libopen-rte.so.12 (0x00002ad2b6a1b000)
        libtorque.so.2 => /opt/torque/lib/libtorque.so.2 (0x00002ad2b6d05000)
        libopen-pal.so.13 => /home/afl/local/opt/openmpi/1.10.0/lib/libopen-pal.so.13 (0x00002ad2b7008000)
        libdl.so.2 => /lib64/libdl.so.2 (0x000000327be00000)
        librt.so.1 => /lib64/librt.so.1 (0x000000327ca00000)
        libutil.so.1 => /lib64/libutil.so.1 (0x0000003289e00000)
        /lib64/ld-linux-x86-64.so.2 (0x000000327b200000)

There’s no trace of libscalapack, so obviously I’m messing something up at some point! 
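
A rough sketch of how this check could be scripted so it is easy to repeat 
after each rebuild. The helper name check_linked is hypothetical, and it 
simply searches ldd output on standard input for lib<name>. Note that "absent" 
only means no shared library of that name is loaded: a statically linked .a 
archive would not appear in ldd output either.

```shell
# For each name given as an argument, report whether "lib<name>" occurs in
# the ldd output read from standard input (hypothetical helper).
check_linked() {
    ldd_output=$(cat)
    for name in "$@"; do
        if printf '%s\n' "$ldd_output" | grep -q -- "lib${name}"; then
            echo "linked ${name}"
        else
            echo "absent ${name}"
        fi
    done
}
```

It would be used as ldd <exec> | check_linked scalapack lapack openblas mpi, 
with <exec> being the siesta executable.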

thanks, 
arthur


> 
> thanks,
> arthur
> 
> 
> 
> 
> 
> -- 
> Kind regards Nick
