Hi,
What segfaulted? I am not sure... maybe it is an application bug showing up with
the static Open MPI build.
I will try to compile and run the simplest MPI example and let you know.
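Something like this trivial broadcast test is what I have in mind (just my own sketch; the file name is arbitrary, the test mimics the 2000-element MPI_BCAST visible in the backtrace below, and the compile flags simply mirror the DIRAC build settings further down):
-----
# write a minimal MPI_Bcast test program
cat > bcast_test.F90 << 'EOF'
program bcast_test
  implicit none
  include 'mpif.h'
  integer :: ierr, rank
  double precision :: buf(2000)
  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)
  buf = dble(rank)        ! after the broadcast from root 0, every rank should hold 0.0
  call mpi_bcast(buf, 2000, mpi_double_precision, 0, mpi_comm_world, ierr)
  write(*,*) 'rank', rank, ' buf(1) =', buf(1)
  call mpi_finalize(ierr)
end program bcast_test
EOF

# build and run it with the same static, 64-bit-integer setup as dirac.x
/home/ilias/bin/ompi_ilp64_static/bin/mpif90 -m64 -fdefault-integer-8 -static bcast_test.F90 -o bcast_test.x
/home/ilias/bin/ompi_ilp64_static/bin/mpirun -np 2 ./bcast_test.x
-----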
In the meantime I am attaching debugger output that may help to track down this bug:
Backtrace for this error:
+ function __restore_rt (0x255B110)
from file sigaction.c
Slave process (the application enters the MPI library at frame #10):
(gdb) where
#0 0x00000000023622db in sm_fifo_read (fifo=0x7f77cc908300) at btl_sm.h:324
#1 0x000000000236309b in mca_btl_sm_component_progress () at
btl_sm_component.c:612
#2 0x0000000002304f26 in opal_progress () at runtime/opal_progress.c:207
#3 0x00000000023c8a77 in opal_condition_wait (c=0xf78bf80, m=0xf78c000) at
../../../../opal/threads/condition.h:100
#4 0x00000000023c8eb7 in ompi_request_wait_completion (req=0x10602f00) at
../../../../ompi/request/request.h:378
#5 0x00000000023ca661 in mca_pml_ob1_send (buf=0xefbb2a0, count=1000,
datatype=0x2901180, dst=1, tag=-17,
sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0xf772b20) at pml_ob1_isend.c:125
#6 0x000000000236e978 in ompi_coll_tuned_bcast_intra_split_bintree
(buffer=0xefb9360, count=2000, datatype=0x2901180, root=0,
comm=0xf772b20, module=0x1060b7f0, segsize=1024) at coll_tuned_bcast.c:590
#7 0x0000000002370834 in ompi_coll_tuned_bcast_intra_dec_fixed
(buff=0xefb9360, count=2000, datatype=0x2901180, root=0,
comm=0xf772b20, module=0x1060b7f0) at coll_tuned_decision_fixed.c:262
#8 0x0000000002371c52 in mca_coll_sync_bcast (buff=0xefb9360, count=2000,
datatype=0x2901180, root=0, comm=0xf772b20,
module=0x1060b590) at coll_sync_bcast.c:44
#9 0x0000000002249662 in PMPI_Bcast (buffer=0xefb9360, count=2000,
datatype=0x2901180, root=0, comm=0xf772b20) at pbcast.c:110
#10 0x000000000221744a in mpi_bcast_f (
buffer=0xefb9360
"\026\372`\031\033\336O@\005\031\001\025\216\260&@\301\343ۻ\006\375\003@\251L1\aAG\344?\301\343ۻ\006\375\003@\251L1\aAG\344?HN&n\025\233\\@\252\325WW\005^4@8\333ܘ\236`\027@\025\253\006an\267\367?Ih˹\024W\324?8\021\375\332\372\351\273?8\333ܘ\236`\027@\025\253\006an\267\367?Ih˹\024W\324?8\021\375\332\372\351\273?\301\343ۻ\006\375\003@\251L1\aAG\344?\026\372`\031\033\336O@\005\031\001\025\216\260&@\301\343ۻ\006\375\003@\251L1\aAG\344?\301\343ۻ\006\375\003@\251L1\aAG\344?8\333ܘ\236`\027@"...,
count=0x2624818, datatype=0x26037a0, root=0xf73ba90, comm=0x26247a0,
ierr=0x7fffbb34ec78) at pbcast_f.c:70
#11 0x000000000041ab68 in interface_to_mpi::interface_mpi_bcast_r1 (x=<value
optimized out>, ndim=2000, root_proc=0, communicator=0)
at
/home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/interface_mpi/interface_to_mpi.F90:446
#12 0x0000000000e95a37 in get_primitf () at
/home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/abacus/herpar.F:1464
#13 0x0000000000e99ff7 in sdinit (dmat=..., ndmat=2, irepdm=..., ifctyp=...,
itype=9, maxdif=<value optimized out>, iatom=0,
nodv=.TRUE., nopv=.TRUE., nocont=.FALSE., tktime=.FALSE., retur=.FALSE.,
i2typ=1, icedif=3, screen=9.9999999999999998e-13,
gabrao=..., dmrao=..., dmrso=...) at
/home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/abacus/herpar.F:566
#14 0x0000000000e9af35 in her_pardrv (work=..., lwork=<value optimized out>,
fmat=..., dmat=..., ndmat=2, irepdm=..., ifctyp=...,
.
.
.
And the master process:
(gdb) where
#0 0x000000000058de18 in poll ()
#1 0x0000000000496f58 in poll_dispatch ()
#2 0x0000000000471649 in opal_libevent2013_event_base_loop ()
#3 0x00000000004016ea in orterun (argc=4, argv=0x7fff484b6478) at orterun.c:866
#4 0x00000000004005d4 in main (argc=4, argv=0x7fff484b6478) at main.c:13
________________________________________
From: Ilias Miroslav
Sent: Monday, January 30, 2012 7:24 PM
To: [email protected]
Subject: Re: pure static "mpirun" launcher (Jeff Squyres) - now testing
Hi Jeff,
thanks for the fix;
I downloaded and built the Open MPI trunk;
the most recent revision (25818) gives this error and then hangs:
/home/ilias/bin/ompi_ilp64_static/bin/mpirun -np 2 ./dirac.x
.
.
Program received signal 11 (SIGSEGV): Segmentation fault.
Backtrace for this error:
+ function __restore_rt (0x255B110)
from file sigaction.c
The configuration:
$ ./configure --prefix=/home/ilias/bin/ompi_ilp64_static
--without-memory-manager LDFLAGS=--static --disable-shared --enable-static
CXX=g++ CC=gcc F77=gfortran FC=gfortran FFLAGS=-m64 -fdefault-integer-8
FCFLAGS=-m64 -fdefault-integer-8 CFLAGS=-m64 CXXFLAGS=-m64
--enable-ltdl-convenience --no-create --no-recursion
The "dirac.x" static executable was obtained with this static openmpi:
 System                   | Linux-2.6.30-1-amd64
 Processor                | x86_64
 Internal math            | ON
 64-bit integers          | ON
 MPI                      | ON
 Fortran compiler         | /home/ilias/bin/ompi_ilp64_static/bin/mpif90
 Fortran compiler version | GNU Fortran (Debian 4.6.2-9) 4.6.2
 Fortran flags            | -g -fcray-pointer -fbacktrace -DVAR_GFORTRAN -DVAR_MFDS
                          | -fno-range-check -static -fdefault-integer-8
                          | -O3 -funroll-all-loops
 C compiler               | /home/ilias/bin/ompi_ilp64_static/bin/mpicc
 C compiler version       | gcc (Debian 4.6.2-9) 4.6.2
 C flags                  | -g -static -fpic -O2 -Wno-unused
 static libraries linking | ON
ldd dirac.x
not a dynamic executable
Any help, please? How do I enable MPI debug output?
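(My own guess, to be confirmed: rebuild Open MPI with its --enable-debug configure option and raise the BTL verbosity at run time, roughly as below, reusing the configure options from above; please correct me if there is a better way.)
-----
# reconfigure with Open MPI's internal debugging enabled
./configure --prefix=/home/ilias/bin/ompi_ilp64_static --enable-debug \
  --without-memory-manager LDFLAGS=--static --disable-shared --enable-static \
  CXX=g++ CC=gcc F77=gfortran FC=gfortran \
  FFLAGS="-m64 -fdefault-integer-8" FCFLAGS="-m64 -fdefault-integer-8" \
  CFLAGS=-m64 CXXFLAGS=-m64

# run with verbose output from the BTL framework (the sm BTL shows up in the backtrace)
/home/ilias/bin/ompi_ilp64_static/bin/mpirun --mca btl_base_verbose 100 -np 2 ./dirac.x
-----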
1. Re: pure static "mpirun" launcher (Jeff Squyres)
----------------------------------------------------------------------
Message: 1
Date: Fri, 27 Jan 2012 13:44:49 -0500
From: Jeff Squyres <[email protected]>
Subject: Re: [OMPI users] pure static "mpirun" launcher
To: Open MPI Users <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii
Ah ha, I think I got it. There was actually a bug about disabling the memory
manager in trunk/v1.5.x/v1.4.x. I fixed it on the trunk and scheduled it for
v1.6 (since we're trying very hard to get v1.5.5 out the door) and v1.4.5.
On the OMPI trunk on RHEL 5 with gcc 4.4.6, I can do this:
./configure --without-memory-manager LDFLAGS=--static --disable-shared
--enable-static
And get a fully static set of OMPI executables. For example:
-----
[10:41] svbu-mpi:~ % cd $prefix/bin
[10:41] svbu-mpi:/home/jsquyres/bogus/bin % ldd *
mpic++:
not a dynamic executable
mpicc:
not a dynamic executable
mpiCC:
not a dynamic executable
mpicxx:
not a dynamic executable
mpiexec:
not a dynamic executable
mpif77:
not a dynamic executable
mpif90:
not a dynamic executable
mpirun:
not a dynamic executable
ompi-clean:
not a dynamic executable
ompi_info:
not a dynamic executable
ompi-ps:
not a dynamic executable
ompi-server:
not a dynamic executable
ompi-top:
not a dynamic executable
opal_wrapper:
not a dynamic executable
ortec++:
not a dynamic executable
ortecc:
not a dynamic executable
orteCC:
not a dynamic executable
orte-clean:
not a dynamic executable
orted:
not a dynamic executable
orte-info:
not a dynamic executable
orte-ps:
not a dynamic executable
orterun:
not a dynamic executable
orte-top:
not a dynamic executable
-----
So I think the answer here is: it depends on a few factors:
1. You need the bug fix that I just committed.
2. Libtool is stripping out -static (and/or --static?), so you have to find
some other flags to make your compiler/linker link statically.
3. Your OS has to support static builds. For example, RHEL6 doesn't install
libc.a by default (it's apparently on the optional DVD, which I don't have).
My RHEL 5.5 install does have it, though.
On Jan 27, 2012, at 11:16 AM, Jeff Squyres wrote:
> I've tried a bunch of variations on this, but I'm actually getting stymied by
> my underlying OS not supporting static linking properly. :-\
>
> I do see that Libtool is stripping out the "-static" standalone flag that you
> passed into LDFLAGS. Yuck. What's -Wl,-E? Can you try "-Wl,-static"
> instead?
>
>
> On Jan 25, 2012, at 1:24 AM, Ilias Miroslav wrote:
>
>> Hello again,
>>
>> I need my own static "mpirun" for porting (together with the static executable)
>> onto various (unknown) grid servers. In grid computing one cannot expect an
>> OpenMPI-ILP64 installation on each computing element.
>>
>> Jeff: I tried LDFLAGS in configure
>>
>> [email protected]:~/bin/ompi-ilp64_full_static/openmpi-1.4.4/../configure
>> --prefix=/home/ilias/bin/ompi-ilp64_full_static -without-memory-manager
>> --without-libnuma --enable-static --disable-shared CXX=g++ CC=gcc
>> F77=gfortran FC=gfortran FFLAGS="-m64 -fdefault-integer-8 -static"
>> FCFLAGS="-m64 -fdefault-integer-8 -static" CFLAGS="-m64 -static"
>> CXXFLAGS="-m64 -static" LDFLAGS="-static -Wl,-E"
>>
>> but I still got a dynamic, not a static, "mpirun":
>> [email protected]:~/bin/ompi-ilp64_full_static/bin/.ldd ./mpirun
>> linux-vdso.so.1 => (0x00007fff6090c000)
>> libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd7277cf000)
>> libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007fd7275b7000)
>> libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fd7273b3000)
>> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd727131000)
>> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
>> (0x00007fd726f15000)
>> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd726b90000)
>> /lib64/ld-linux-x86-64.so.2 (0x00007fd7279ef000)
>>
>> Any help, please? The config.log is here:
>>
>> https://docs.google.com/open?id=0B8qBHKNhZAipNTNkMzUxZDEtNjJmZi00YzY3LWI4MmYtY2RkZDVkMjhiOTM1
>>
>> Best, Miro
>> ------------------------------
>> Message: 10
>> Date: Tue, 24 Jan 2012 11:55:21 -0500
>> From: Jeff Squyres <[email protected]>
>> Subject: Re: [OMPI users] pure static "mpirun" launcher
>> To: Open MPI Users <[email protected]>
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=windows-1252
>>
>> Ilias: Have you simply tried building Open MPI with flags that force static
>> linking? E.g., something like this:
>>
>> ./configure --enable-static --disable-shared LDFLAGS=-Wl,-static
>>
>> I.e., put in LDFLAGS whatever flags your compiler/linker needs to force
>> static linking. These LDFLAGS will be applied to all of Open MPI's
>> executables, including mpirun.
>>
>>
>> On Jan 24, 2012, at 10:28 AM, Ralph Castain wrote:
>>
>>> Good point! I'm traveling this week with limited resources, but will try to
>>> address when able.
>>>
>>> Sent from my iPad
>>>
>>> On Jan 24, 2012, at 7:07 AM, Reuti <[email protected]> wrote:
>>>
>>>> Am 24.01.2012 um 15:49 schrieb Ralph Castain:
>>>>
>>>>> I'm a little confused. Building procs static makes sense as libraries may
>>>>> not be available on compute nodes. However, mpirun is only executed in
>>>>> one place, usually the head node where it was built. So there is less
>>>>> reason to build it purely static.
>>>>>
>>>>> Are you trying to move mpirun somewhere? Or is it the daemons that mpirun
>>>>> launches that are the real problem?
>>>>
>>>> This depends: if you have a queuing system, the master node of a parallel
>>>> job may already be one of the slave nodes, i.e. the node where the jobscript
>>>> runs. My own nodes are uniform, but I have seen sites where that was not
>>>> the case.
>>>>
>>>> An option would be to have a special queue which always executes the
>>>> jobscript on the headnode (i.e. without generating any load there) and to
>>>> use only the non-local granted slots for mpirun. For this it might be
>>>> necessary to configure a high number of slots on the headnode for this
>>>> queue, and to always request one slot on this machine in addition to the
>>>> necessary ones on the compute nodes.
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On Jan 24, 2012, at 5:54 AM, Ilias Miroslav <[email protected]> wrote:
>>>>>
>>>>>> Dear experts,
>>>>>>
>>>>>> following http://www.open-mpi.org/faq/?category=building#static-build I
>>>>>> successfully built a static Open MPI library.
>>>>>> Using this library I succeeded in building a parallel static
>>>>>> executable, dirac.x ("ldd dirac.x" reports "not a dynamic executable").
>>>>>>
>>>>>> The problem remains, however, with the mpirun (orterun) launcher.
>>>>>> On the local machine, where I compiled both the static Open MPI and the
>>>>>> static dirac.x, I am able to launch a parallel job with
>>>>>> <OpenMPI_static>/mpirun -np 2 dirac.x ,
>>>>>> but I cannot launch it elsewhere, because "mpirun" is dynamically linked
>>>>>> and thus machine dependent:
>>>>>>
>>>>>> ldd mpirun:
>>>>>> linux-vdso.so.1 => (0x00007fff13792000)
>>>>>> libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f40f8cab000)
>>>>>> libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007f40f8a93000)
>>>>>> libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f40f888f000)
>>>>>> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f40f860d000)
>>>>>> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> (0x00007f40f83f1000)
>>>>>> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f40f806c000)
>>>>>> /lib64/ld-linux-x86-64.so.2 (0x00007f40f8ecb000)
>>>>>>
>>>>>> Please, how do I build a "pure" static mpirun launcher, usable (in my case
>>>>>> together with the static dirac.x) also on other computers?
>>>>>>
>>>>>> Thanks, Miro
>>>>>>
>>>>>> --
>>>>>> RNDr. Miroslav Iliaš, PhD.
>>>>>>
>>>>>> Katedra chémie
>>>>>> Fakulta prírodných vied
>>>>>> Univerzita Mateja Bela
>>>>>> Tajovského 40
>>>>>> 97400 Banská Bystrica
>>>>>> tel: +421 48 446 7351
>>>>>> email : [email protected]
>>>>>>
>>>>>> Department of Chemistry
>>>>>> Faculty of Natural Sciences
>>>>>> Matej Bel University
>>>>>> Tajovského 40
>>>>>> 97400 Banska Bystrica
>>>>>> Slovakia
>>>>>> tel: +421 48 446 7351
>>>>>> email : [email protected]
>>>>>>
--
Jeff Squyres
[email protected]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/