Hi Paul,

Thanks very much for the Christmas present.

The Open MPI README has been updated
to include a note about issues with the Intel 16.0.3-4 compiler suites.
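
For anyone who wants to check for this in a site script, here is a minimal
sketch (not taken from the README). It only encodes the versions Paul lists
below: 16.0.3 and 16.0.4 as affected, 16.0.2 and 17.0 as not affected. The
function name and the "unknown" result for any other version are my own
assumptions for illustration.

```python
# Versions as reported in this thread; anything else is treated as unknown.
AFFECTED = {(16, 0, 3), (16, 0, 4)}
NOT_AFFECTED = {(16, 0, 2), (17, 0)}


def classify_intel_version(version):
    """Classify a dotted Intel compiler version string, e.g. '16.0.4.258'.

    Hypothetical helper: returns 'affected', 'not affected' or 'unknown'.
    Raises ValueError if a field of the version string is not an integer.
    """
    fields = tuple(int(part) for part in version.strip().split("."))
    if fields[:3] in AFFECTED:
        return "affected"
    # 16.0.2 is matched on three fields, 17.0 on two (so 17.0.x also matches).
    if fields[:3] in NOT_AFFECTED or fields[:2] in NOT_AFFECTED:
        return "not affected"
    return "unknown"


if __name__ == "__main__":
    for v in ("16.0.2.181", "16.0.3.210", "16.0.4.258", "17.0"):
        print(v, "->", classify_intel_version(v))
```

The build numbers used in the loop (181, 210, 258) are the ones mentioned
further down in this thread; the check itself ignores them.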

Enjoy the holidays,

Howard


2016-12-23 3:41 GMT-07:00 Paul Kapinos <kapi...@itc.rwth-aachen.de>:

> Hi all,
>
> we discussed this issue with Intel compiler support, and it looks like they
> now know what the issue is and how to guard against it. It is a known issue
> resulting from a backwards incompatibility in an OS/glibc update, cf.
> https://sourceware.org/bugzilla/show_bug.cgi?id=20019
>
> Affected versions of the Intel compilers: 16.0.3, 16.0.4
> Not affected versions: 16.0.2, 17.0
>
> So, simply do not use the affected versions (and hope for a bugfix update in
> the 16.x series if, like us, you cannot immediately upgrade to 17.x, even
> though upgrading is Intel's preferred option).
>
> Have a nice Christmas time!
>
> Paul Kapinos
>
> On 12/14/16 13:29, Paul Kapinos wrote:
>
>> Hello all,
>> we seem to have run into the same issue: 'mpif90' segfaults immediately for
>> Open MPI 1.10.4 compiled using Intel compilers 16.0.4.258 and 16.0.3.210,
>> while it works fine when compiled with 16.0.2.181.
>>
>> It seems to be a compiler issue (more exactly, an issue in the libraries
>> delivered with versions 16.0.4.258 and 16.0.3.210). Switching the loaded
>> compiler version back to 16.0.2.181 (and thus changing the dynamically
>> loaded libs) lets the previously failing binary (compiled with the newer
>> compilers) work properly.
>>
>> Compiling with -O0 does not help. As the issue is likely in the Intel libs
>> (as said, swapping these out makes the issue appear or disappear), we will
>> fall back to compiler version 16.0.2.181. We will try to open a case with
>> Intel - let's see...
>>
>> Have a nice day,
>>
>> Paul Kapinos
>>
>>
>>
>> On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote:
>>
>>> Ok, good.
>>>
>>> I asked that question because typically when we see errors like this, it is
>>> usually either a busted compiler installation or inadvertently mixing the
>>> run-times of multiple different compilers in some kind of incompatible way.
>>> Specifically, the mpifort (aka mpif90) application is a fairly simple
>>> program -- there's no reason it should segv, especially with a stack trace
>>> like the one you sent, which implies that it's dying early in startup,
>>> potentially even before it has hit any Open MPI code (i.e., it could even
>>> be pre-main).
>>>
>>> BTW, you might be able to get a more complete stack trace from the debugger
>>> that comes with the Intel compiler (idb?  I don't remember offhand).
>>>
>>> Since you are able to run simple programs compiled by this compiler, it
>>> sounds like the compiler is working fine.  Good!
>>>
>>> The next thing to check is to see if somehow the compiler and/or run-time
>>> environments are getting mixed up.  E.g., the apps were compiled for one
>>> compiler/run-time but are being used with another.  Also ensure that any
>>> compiler/linker flags that you are passing to Open MPI's configure script
>>> are native and correct for the platform for which you're compiling (e.g.,
>>> don't pass in flags that optimize for a different platform; that may result
>>> in generating machine code instructions that are invalid for your platform).
>>>
>>> Try recompiling/re-installing Open MPI from scratch, and if it still doesn't
>>> work, then send all the information listed here:
>>>
>>>     https://www.open-mpi.org/community/help/
>>>
>>>
>>> On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote:
>>>>
>>>> Yes, I've tried three simple "Hello world" programs in Fortran, C and C++,
>>>> and they compile and run with Intel 16.0.3. The problem is with the Open MPI
>>>> compiled from source.
>>>>
>>>> Giacomo Rossi Ph.D., Space Engineer
>>>>
>>>> Research Fellow at Dept. of Mechanical and Aerospace Engineering,
>>>> "Sapienza"
>>>> University of Rome
>>>> p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com
>>>>
>>>> Member of Fortran-FOSS-programmers
>>>>
>>>>
>>>> 2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
>>>>  gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
>>>> GNU gdb (GDB) 7.11
>>>> Copyright (C) 2016 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>>> and "show warranty" for details.
>>>> This GDB was configured as "x86_64-pc-linux-gnu".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <http://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done.
>>>> (gdb) r -v
>>>> Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> 0x00007ffff6858f38 in ?? ()
>>>> (gdb) bt
>>>> #0  0x00007ffff6858f38 in ?? ()
>>>> #1  0x00007ffff7de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
>>>> #2  0x00007ffff7ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2
>>>> #3  0x00007ffff7df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2
>>>> #4  0x00007ffff7dddd4a in _dl_start () from /lib64/ld-linux-x86-64.so.2
>>>> #5  0x00007ffff7dd9d98 in _start () from /lib64/ld-linux-x86-64.so.2
>>>> #6  0x0000000000000002 in ?? ()
>>>> #7  0x00007fffffffaa8a in ?? ()
>>>> #8  0x00007fffffffaab6 in ?? ()
>>>> #9  0x0000000000000000 in ?? ()
>>>>
>>>> 2016-05-05 10:44 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
>>>> Here is the result of the ldd command:
>>>> 'ldd /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90
>>>>     linux-vdso.so.1 (0x00007ffcacbbe000)
>>>>     libopen-pal.so.13 => /opt/openmpi/1.10.2/intel/16.0.3/lib/libopen-pal.so.13 (0x00007fa9597a9000)
>>>>     libm.so.6 => /usr/lib/libm.so.6 (0x00007fa9594a4000)
>>>>     libpciaccess.so.0 => /usr/lib/libpciaccess.so.0 (0x00007fa95929a000)
>>>>     libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fa959096000)
>>>>     librt.so.1 => /usr/lib/librt.so.1 (0x00007fa958e8e000)
>>>>     libutil.so.1 => /usr/lib/libutil.so.1 (0x00007fa958c8b000)
>>>>     libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fa958a75000)
>>>>     libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa958858000)
>>>>     libc.so.6 => /usr/lib/libc.so.6 (0x00007fa9584b7000)
>>>>     libimf.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libimf.so (0x00007fa957fb9000)
>>>>     libsvml.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libsvml.so (0x00007fa9570ad000)
>>>>     libirng.so => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libirng.so (0x00007fa956d3b000)
>>>>     libintlc.so.5 => /home/giacomo/intel/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64/libintlc.so.5 (0x00007fa956acf000)
>>>>     /lib64/ld-linux-x86-64.so.2 (0x00007fa959ab9000)'
>>>>
>>>> I can't provide a core file, because I can't compile or launch any program
>>>> with mpifort... I always get the error 'core dumped', also when I try to
>>>> compile a program with mpifort, and of course there isn't any core file.
>>>>
>>>> 2016-05-05 8:50 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>:
>>>> I’ve installed the latest version of Intel Parallel Studio (16.0.3), then
>>>> I’ve downloaded the latest version of Open MPI (1.10.2) and I’ve compiled
>>>> it with
>>>>
>>>> `./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/opt/openmpi/1.10.2/intel/16.0.3`
>>>>
>>>> then I installed it and everything seemed OK, but when I try the simple
>>>> command
>>>>
>>>> ' /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v'
>>>>
>>>> I receive the following error
>>>>
>>>> 'Segmentation fault (core dumped)'
>>>>
>>>> I'm on ArchLinux, with kernel 4.5.1-1-ARCH; I've attached to this email the
>>>> config.log file, compressed with bzip2.
>>>>
>>>> Any help will be appreciated!
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/users/2016/05/29108.php
>>>>
>>>
>>>
>>>
>>
>>
>
> --
> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> RWTH Aachen University, IT Center
> Seffenter Weg 23,  D 52074  Aachen (Germany)
> Tel: +49 241/80-24915
>
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>