Hi Paul,

Thanks very much for the Christmas present.
The Open MPI README has been updated to include a note about issues with the
Intel 16.0.3-4 compiler suites.

Enjoy the holidays,

Howard

2016-12-23 3:41 GMT-07:00 Paul Kapinos <kapi...@itc.rwth-aachen.de>:

> Hi all,
>
> we discussed this issue with Intel compiler support, and it looks like they
> now know what the issue is and how to protect against it. It is a known
> issue resulting from a backwards incompatibility in an OS/glibc update, cf.
> https://sourceware.org/bugzilla/show_bug.cgi?id=20019
>
> Affected versions of the Intel compilers: 16.0.3, 16.0.4
> Not affected versions: 16.0.2, 17.0
>
> So, simply do not use the affected versions (and hope for a bugfix update
> in the 16.x series if you cannot immediately upgrade to 17.x, as is the
> case for us, even though upgrading is Intel's preferred option).
>
> Have a nice Christmas time!
>
> Paul Kapinos
>
> On 12/14/16 13:29, Paul Kapinos wrote:
>
>> Hello all,
>> we seem to have run into the same issue: 'mpif90' segfaults immediately
>> for Open MPI 1.10.4 compiled using Intel compilers 16.0.4.258 and
>> 16.0.3.210, while it works fine when compiled with 16.0.2.181.
>>
>> It seems to be a compiler issue (more exactly: a library issue in the libs
>> delivered with the 16.0.4.258 and 16.0.3.210 versions). Changing the
>> loaded compiler version back to 16.0.2.181 (=> change of dynamically
>> loaded libs) lets the previously failing binary (compiled with the newer
>> compilers) work properly.
>>
>> Compiling with -O0 does not help. As the issue is likely in the Intel libs
>> (as said, swapping these out resolves or triggers the issue), we will fall
>> back to the 16.0.2.181 compiler version. We will try to open a case with
>> Intel - let's see...
>>
>> Have a nice day,
>>
>> Paul Kapinos
>>
>>
>> On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote:
>>
>>> Ok, good.
>>>
>>> I asked that question because typically when we see errors like this, it
>>> is usually either a busted compiler installation or inadvertently mixing
>>> the run-times of multiple different compilers in some kind of
>>> incompatible way. Specifically, the mpifort (aka mpif90) application is a
>>> fairly simple program -- there's no reason it should segv, especially
>>> with a stack trace like the one you sent, which implies that it's dying
>>> early in startup, potentially even before it has hit any Open MPI code
>>> (i.e., it could even be pre-main).
>>>
>>> BTW, you might be able to get a more complete stack trace from the
>>> debugger that comes with the Intel compiler (idb? I don't remember
>>> offhand).
>>>
>>> Since you are able to run simple programs compiled by this compiler, it
>>> sounds like the compiler is working fine. Good!
>>>
>>> The next thing to check is to see if somehow the compiler and/or run-time
>>> environments are getting mixed up. E.g., the apps were compiled for one
>>> compiler/run-time but are being used with another. Also ensure that any
>>> compiler/linker flags that you are passing to Open MPI's configure script
>>> are native and correct for the platform for which you're compiling (e.g.,
>>> don't pass in flags that optimize for a different platform; that may
>>> result in generating machine code instructions that are invalid for your
>>> platform).
>>>
>>> Try recompiling/re-installing Open MPI from scratch, and if it still
>>> doesn't work, then send all the information listed here:
>>>
>>> https://www.open-mpi.org/community/help/
>>>
>>>
>>> On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote:
>>>>
>>>> Yes, I've tried three simple "Hello world" programs in Fortran, C and
>>>> C++, and they compile and run with Intel 16.0.3. The problem is with the
>>>> Open MPI compiled from source.
>>>>
>>>> Giacomo Rossi Ph.D., Space Engineer
>>>>
>>>> Research Fellow at Dept. of Mechanical and Aerospace Engineering,
>>>> "Sapienza" University of Rome
>>>> p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com
>>>>
>>>> Member of Fortran-FOSS-programmers
>>>>
>>>>
>>>> 2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
>>>> gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
>>>> GNU gdb (GDB) 7.11
>>>> Copyright (C) 2016 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>>>> and "show warranty" for details.
>>>> This GDB was configured as "x86_64-pc-linux-gnu".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <http://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done.
>>>> (gdb) r -v
>>>> Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> 0x00007ffff6858f38 in ?? ()
>>>> (gdb) bt
>>>> #0 0x00007ffff6858f38 in ?? ()
>>>> #1 0x00007ffff7de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
>>>> #2 0x00007ffff7ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2
>>>> #3 0x00007ffff7df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2
>>>> #4 0x00007ffff7dddd4a in _dl_start () from /lib64/ld-linux-x86-64.so.2
>>>> #5 0x00007ffff7dd9d98 in _start () from /lib64/ld-linux-x86-64.so.2
>>>> #6 0x0000000000000002 in ?? ()
>>>> #7 0x00007fffffffaa8a in ?? ()
>>>> #8 0x00007fffffffaab6 in ?? ()
>>>> #9 0x0000000000000000 in ?? ()
>>>>
>>>>
>>>> 2016-05-05 10:44 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
>>>> Here is the result of the ldd command:
>>>> 'ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
>>>> linux-vdso.so.1 (0x00007ffcacbbe000)
>>>> libopen-pal.so.13 => /opt/openmpi/1.10.2/intel/16.0.3/lib/libopen-pal.so.13 (0x00007fa9597a9000)
>>>> libm.so.6 => /usr/lib/libm.so.6 (0x00007fa9594a4000)
>>>> libpciaccess.so.0 => /usr/lib/libpciaccess.so.0 (0x00007fa95929a000)
>>>> libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fa959096000)
>>>> librt.so.1 => /usr/lib/librt.so.1 (0x00007fa958e8e000)
>>>> libutil.so.1 => /usr/lib/libutil.so.1 (0x00007fa958c8b000)
>>>> libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fa958a75000)
>>>> libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa958858000)
>>>> libc.so.6 => /usr/lib/libc.so.6 (0x00007fa9584b7000)
>>>> libimf.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libimf.so (0x00007fa957fb9000)
>>>> libsvml.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libsvml.so (0x00007fa9570ad000)
>>>> libirng.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libirng.so (0x00007fa956d3b000)
>>>> libintlc.so.5 => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fa956acf000)
>>>> /lib64/ld-linux-x86-64.so.2 (0x00007fa959ab9000)'
>>>>
>>>> I can't provide a core file, because I can't compile or launch any
>>>> program with mpifort...
>>>> I always get the 'core dumped' error, even when I try to compile a
>>>> program with mpifort, and of course there isn't any core file.
>>>>
>>>>
>>>> 2016-05-05 8:50 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
>>>> I’ve installed the latest version of Intel Parallel Studio (16.0.3),
>>>> then I’ve downloaded the latest version of openmpi (1.10.2) and I’ve
>>>> compiled it with
>>>>
>>>> `./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/opt/openmpi/1.10.2/intel/16.0.3`
>>>>
>>>> then I've installed it and everything seems ok, but when I try the
>>>> simple command
>>>>
>>>> ' /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v'
>>>>
>>>> I receive the following error
>>>>
>>>> 'Segmentation fault (core dumped)'
>>>>
>>>> I'm on ArchLinux, with kernel 4.5.1-1-ARCH; I've attached to this email
>>>> the config.log file compressed with bzip2.
>>>>
>>>> Any help will be appreciated!
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/users/2016/05/29108.php
>>>>
>>>
>>
>
> --
> Dipl.-Inform. Paul Kapinos - High Performance Computing,
> RWTH Aachen University, IT Center
> Seffenter Weg 23, D 52074 Aachen (Germany)
> Tel: +49 241/80-24915
>
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
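A footnote for anyone who wants to guard a build script against the releases named in this thread: below is a minimal Python sketch, not something posted by the participants. It assumes the caller already has the Intel compiler version as a dotted string (for example 16.0.4.258); the names AFFECTED_PREFIXES, parse_version and is_affected are hypothetical, and the affected list contains only what was reported here (16.0.3 and 16.0.4).

```python
# Affected release prefixes as reported in this thread (16.0.3 and 16.0.4).
# This list is an assumption taken from the emails, not from Intel documentation.
AFFECTED_PREFIXES = ((16, 0, 3), (16, 0, 4))


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '16.0.4.258' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def is_affected(version: str) -> bool:
    """Return True if the first three components match a reported-affected release."""
    return parse_version(version)[:3] in AFFECTED_PREFIXES


if __name__ == "__main__":
    # Versions mentioned in the thread; the caller supplies the string.
    for v in ("16.0.2.181", "16.0.3.210", "16.0.4.258", "17.0"):
        status = "reported affected" if is_affected(v) else "not reported as affected"
        print(v, status)
```

Comparing only the first three components means any build number of 16.0.3 or 16.0.4 is flagged, which matches how the versions are described above; a later fixed update in the 16.x series would need the list adjusted by hand.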