Hello all,
we seem to have run into the same issue: 'mpif90' segfaults (SIGSEGV) immediately when Open MPI 1.10.4 is compiled with Intel compilers 16.0.4.258 or 16.0.3.210, while it works fine when compiled with 16.0.2.181.

It seems to be a compiler issue (more precisely: an issue in the libraries delivered with the 16.0.4.258 and 16.0.3.210 versions). Switching the loaded compiler version back to 16.0.2.181 (and thus the dynamically loaded libraries) lets the previously failing binary (compiled with the newer compilers) work properly.

Compiling with -O0 does not help. Since the issue is likely in the Intel libraries (as noted, swapping them out makes the problem disappear or reappear), we will fall back to the 16.0.2.181 compiler version. We will also try to open a case with Intel - let's see...
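For reference, the effect of the workaround can be approximated even without a module system; this is a sketch under the assumption that the 16.0.2.181 runtime lives under the usual compilers_and_libraries install path (the exact directory is site-specific):

```shell
# Hedged sketch: point the dynamic loader at the older 16.0.2.181 runtime
# so an already-built binary picks up the working libraries.
# The directory below is an assumption; adjust to your actual Intel install.
OLD_RT=/opt/intel/compilers_and_libraries_2016.2.181/linux/compiler/lib/intel64
export LD_LIBRARY_PATH="$OLD_RT${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"   # the older runtime directory now comes first
```

With a module system, `module switch` to the older compiler achieves the same thing by adjusting LD_LIBRARY_PATH for you.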

Have a nice day,

Paul Kapinos



On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote:
Ok, good.

I asked that question because when we see errors like this, it is 
usually either a busted compiler installation or inadvertently mixing the 
run-times of multiple different compilers in some kind of incompatible way.  
Specifically, the mpifort (aka mpif90) application is a fairly simple program 
-- there's no reason it should segv, especially with a stack trace that you 
sent that implies that it's dying early in startup, potentially even before it 
has hit any Open MPI code (i.e., it could even be pre-main).

BTW, you might be able to get a more complete stack trace from the debugger 
that comes with the Intel compiler (idb?  I don't remember offhand).
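For what it's worth, gdb can also be driven non-interactively to capture the trace in one go (a sketch; the mpif90 path is the one from this thread, so adjust it for your install):

```shell
# Hedged sketch: batch-mode gdb runs the program and prints the backtrace
# without any interactive prompting.
MPIF90=/opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90   # path from this thread
if [ -x "$MPIF90" ]; then
    gdb -batch -ex run -ex bt --args "$MPIF90" -v
else
    echo "mpif90 not found at $MPIF90; adjust the path for your install"
fi
```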

Since you are able to run simple programs compiled by this compiler, it sounds 
like the compiler is working fine.  Good!

The next thing to check is to see if somehow the compiler and/or run-time 
environments are getting mixed up.  E.g., the apps were compiled for one 
compiler/run-time but are being used with another.  Also ensure that any 
compiler/linker flags that you are passing to Open MPI's configure script are 
native and correct for the platform for which you're compiling (e.g., don't 
pass in flags that optimize for a different platform; that may result in 
generating machine code instructions that are invalid for your platform).
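One quick way to spot such run-time mixing is to inspect which runtime libraries the binary actually resolves. This is a generic sketch: the library names are the usual Intel runtime ones, and the demo target is a system binary rather than mpif90:

```shell
# Hedged sketch: list the Intel runtime entries a binary resolves via ldd.
# All of libimf/libsvml/libirng/libintlc should come from the SAME
# compiler version's directory; a mix indicates a run-time mismatch.
runtime_libs() {
    ldd "$1" | grep -E 'libimf|libsvml|libirng|libintlc' \
        || echo "(no Intel runtime entries)"
}
runtime_libs /bin/sh    # demo target; point it at your mpif90 instead
```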

Try recompiling/re-installing Open MPI from scratch, and if it still doesn't 
work, then send all the information listed here:

    https://www.open-mpi.org/community/help/


On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote:

Yes, I've tried three simple "Hello world" programs in Fortran, C, and C++, and 
they compile and run with Intel 16.0.3. The problem is with the Open MPI compiled from 
source.

Giacomo Rossi Ph.D., Space Engineer

Research Fellow at Dept. of Mechanical and Aerospace Engineering, "Sapienza" 
University of Rome
p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com

Member of Fortran-FOSS-programmers


2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
 gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done.
(gdb) r -v
Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6858f38 in ?? ()
(gdb) bt
#0  0x00007ffff6858f38 in ?? ()
#1  0x00007ffff7de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ffff7ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2
#3  0x00007ffff7df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ffff7dddd4a in _dl_start () from /lib64/ld-linux-x86-64.so.2
#5  0x00007ffff7dd9d98 in _start () from /lib64/ld-linux-x86-64.so.2
#6  0x0000000000000002 in ?? ()
#7  0x00007fffffffaa8a in ?? ()
#8  0x00007fffffffaab6 in ?? ()
#9  0x0000000000000000 in ?? ()


2016-05-05 10:44 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
Here is the result of the ldd command:

'ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
        linux-vdso.so.1 (0x00007ffcacbbe000)
        libopen-pal.so.13 => /opt/openmpi/1.10.2/intel/16.0.3/lib/libopen-pal.so.13 (0x00007fa9597a9000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fa9594a4000)
        libpciaccess.so.0 => /usr/lib/libpciaccess.so.0 (0x00007fa95929a000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fa959096000)
        librt.so.1 => /usr/lib/librt.so.1 (0x00007fa958e8e000)
        libutil.so.1 => /usr/lib/libutil.so.1 (0x00007fa958c8b000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fa958a75000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa958858000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fa9584b7000)
        libimf.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libimf.so (0x00007fa957fb9000)
        libsvml.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libsvml.so (0x00007fa9570ad000)
        libirng.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libirng.so (0x00007fa956d3b000)
        libintlc.so.5 => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fa956acf000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fa959ab9000)'

I can't provide a core file, because I can't compile or launch any program with 
mpifort... I always get the 'core dumped' error, even when I just try to compile a 
program with mpifort, yet no core file is actually produced.
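As an aside, "core dumped" with no core file on disk is often just the shell's core-size limit or the kernel's core pattern; a generic Linux check (not specific to this setup):

```shell
# Hedged sketch: check why "core dumped" leaves no core file behind.
ulimit -c unlimited                  # lift the per-shell core-size limit
ulimit -c                            # should now print "unlimited"
# On many distros the kernel pipes cores to a handler (e.g. systemd-coredump);
# the core pattern shows where dumps actually end up:
cat /proc/sys/kernel/core_pattern
```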



2016-05-05 8:50 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
I’ve installed the latest version of Intel Parallel Studio (16.0.3), then I’ve 
downloaded the latest version of openmpi (1.10.2) and I’ve compiled it with

`./configure CC=icc CXX=icpc F77=ifort FC=ifort 
--prefix=/opt/openmpi/1.10.2/intel/16.0.3`

then I've installed it and everything seemed OK, but when I try the simple command

' /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v'

I receive the following error

'Segmentation fault (core dumped)'

I'm on Arch Linux, with kernel 4.5.1-1-ARCH; I've attached to this email the 
config.log file compressed with bzip2.

Any help will be appreciated!








_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2016/05/29108.php




--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915

