Hi,

What segfaulted? I am not sure... maybe it is an application bug that only shows up 
with static Open MPI.

I will try to compile & run the simplest MPI example and let you know.
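For reference, the kind of minimal test I have in mind is something like the sketch 
below (my own hypothetical bcast_test program, not taken from DIRAC; it assumes it is 
compiled with the same static mpif90, i.e. with -m64 -fdefault-integer-8, and run with 
mpirun -np 2 ./bcast_test):

  program bcast_test
    implicit none
    include 'mpif.h'
    integer :: ierr, rank, nprocs, i
    double precision :: buf(1000)

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

    ! rank 0 fills the buffer; all ranks receive it via MPI_Bcast,
    ! the same call that appears in the dirac.x backtrace below
    if (rank == 0) then
       do i = 1, 1000
          buf(i) = dble(i)
       end do
    else
       buf = 0.0d0
    end if

    call MPI_Bcast(buf, 1000, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)

    write(*,*) 'rank', rank, 'of', nprocs, ': buf(1000) =', buf(1000)

    call MPI_Finalize(ierr)
  end program bcast_test

If this simple case also hangs or segfaults, the problem is presumably in the static 
Open MPI build rather than in our application.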

In the meantime I am attaching debugger output that may help to track down this bug:

Backtrace for this error:
  + function __restore_rt (0x255B110)
    from file sigaction.c


slave (the application enters MPI at frame #10):
(gdb) where
#0  0x00000000023622db in sm_fifo_read (fifo=0x7f77cc908300) at btl_sm.h:324
#1  0x000000000236309b in mca_btl_sm_component_progress () at btl_sm_component.c:612
#2  0x0000000002304f26 in opal_progress () at runtime/opal_progress.c:207
#3  0x00000000023c8a77 in opal_condition_wait (c=0xf78bf80, m=0xf78c000) at ../../../../opal/threads/condition.h:100
#4  0x00000000023c8eb7 in ompi_request_wait_completion (req=0x10602f00) at ../../../../ompi/request/request.h:378
#5  0x00000000023ca661 in mca_pml_ob1_send (buf=0xefbb2a0, count=1000, datatype=0x2901180, dst=1, tag=-17, sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0xf772b20) at pml_ob1_isend.c:125
#6  0x000000000236e978 in ompi_coll_tuned_bcast_intra_split_bintree (buffer=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20, module=0x1060b7f0, segsize=1024) at coll_tuned_bcast.c:590
#7  0x0000000002370834 in ompi_coll_tuned_bcast_intra_dec_fixed (buff=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20, module=0x1060b7f0) at coll_tuned_decision_fixed.c:262
#8  0x0000000002371c52 in mca_coll_sync_bcast (buff=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20, module=0x1060b590) at coll_sync_bcast.c:44
#9  0x0000000002249662 in PMPI_Bcast (buffer=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20) at pbcast.c:110
#10 0x000000000221744a in mpi_bcast_f (
    buffer=0xefb9360 
"\026\372`\031\033\336O@\005\031\001\025\216\260&@\301\343ۻ\006\375\003@\251L1\aAG\344?\301\343ۻ\006\375\003@\251L1\aAG\344?HN&n\025\233\\@\252\325WW\005^4@8\333ܘ\236`\027@\025\253\006an\267\367?Ih˹\024W\324?8\021\375\332\372\351\273?8\333ܘ\236`\027@\025\253\006an\267\367?Ih˹\024W\324?8\021\375\332\372\351\273?\301\343ۻ\006\375\003@\251L1\aAG\344?\026\372`\031\033\336O@\005\031\001\025\216\260&@\301\343ۻ\006\375\003@\251L1\aAG\344?\301\343ۻ\006\375\003@\251L1\aAG\344?8\333ܘ\236`\027@"...,
    count=0x2624818, datatype=0x26037a0, root=0xf73ba90, comm=0x26247a0, ierr=0x7fffbb34ec78) at pbcast_f.c:70
#11 0x000000000041ab68 in interface_to_mpi::interface_mpi_bcast_r1 (x=<value optimized out>, ndim=2000, root_proc=0, communicator=0) at /home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/interface_mpi/interface_to_mpi.F90:446
#12 0x0000000000e95a37 in get_primitf () at /home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/abacus/herpar.F:1464
#13 0x0000000000e99ff7 in sdinit (dmat=..., ndmat=2, irepdm=..., ifctyp=..., itype=9, maxdif=<value optimized out>, iatom=0, nodv=.TRUE., nopv=.TRUE., nocont=.FALSE., tktime=.FALSE., retur=.FALSE., i2typ=1, icedif=3, screen=9.9999999999999998e-13, gabrao=..., dmrao=..., dmrso=...) at /home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/abacus/herpar.F:566
#14 0x0000000000e9af35 in her_pardrv (work=..., lwork=<value optimized out>, fmat=..., dmat=..., ndmat=2, irepdm=..., ifctyp=...,
. 
. 
. 
and the master (mpirun):
(gdb) where
#0  0x000000000058de18 in poll ()
#1  0x0000000000496f58 in poll_dispatch ()
#2  0x0000000000471649 in opal_libevent2013_event_base_loop ()
#3  0x00000000004016ea in orterun (argc=4, argv=0x7fff484b6478) at orterun.c:866
#4  0x00000000004005d4 in main (argc=4, argv=0x7fff484b6478) at main.c:13


________________________________________
From: Ilias Miroslav
Sent: Monday, January 30, 2012 7:24 PM
To: us...@open-mpi.org
Subject: Re: pure static "mpirun" launcher (Jeff Squyres) - now testing

Hi Jeff,

thanks for the fix.

I downloaded the Open MPI trunk and built it. The most recent revision (r25818) 
gives this error and then hangs:

/home/ilias/bin/ompi_ilp64_static/bin/mpirun -np 2   ./dirac.x
. 
. 
Program received signal 11 (SIGSEGV): Segmentation fault.

Backtrace for this error:
  + function __restore_rt (0x255B110)
    from file sigaction.c

The configuration:
  $ ./configure --prefix=/home/ilias/bin/ompi_ilp64_static 
--without-memory-manager LDFLAGS=--static --disable-shared --enable-static 
CXX=g++ CC=gcc F77=gfortran FC=gfortran FFLAGS=-m64 -fdefault-integer-8 
FCFLAGS=-m64 -fdefault-integer-8 CFLAGS=-m64 CXXFLAGS=-m64 
--enable-ltdl-convenience --no-create --no-recursion

The "dirac.x" static executable was obtained with this static openmpi:
    write(lupri, '(a)') ' System                   | Linux-2.6.30-1-amd64'
    write(lupri, '(a)') ' Processor                | x86_64'
    write(lupri, '(a)') ' Internal math            | ON'
    write(lupri, '(a)') ' 64-bit integers          | ON'
    write(lupri, '(a)') ' MPI                      | ON'
    write(lupri, '(a)') ' Fortran compiler         | /home/ilias/bin/ompi_ilp64_static/bin/mpif90'
    write(lupri, '(a)') ' Fortran compiler version | GNU Fortran (Debian 4.6.2-9) 4.6.2'
    write(lupri, '(a)') ' Fortran flags            | -g -fcray-pointer -fbacktrace -DVAR_GFORTRAN -DVAR'
    write(lupri, '(a)') '                          | _MFDS -fno-range-check -static -fdefault-integer-8'
    write(lupri, '(a)') '                          |   -O3 -funroll-all-loops'
    write(lupri, '(a)') ' C compiler               | /home/ilias/bin/ompi_ilp64_static/bin/mpicc'
    write(lupri, '(a)') ' C compiler version       | gcc (Debian 4.6.2-9) 4.6.2'
    write(lupri, '(a)') ' C flags                  | -g -static -fpic -O2 -Wno-unused'
    write(lupri, '(a)') ' static libraries linking | ON'

ldd dirac.x
        not a dynamic executable


Any help, please? How can I enable MPI debug output?

----------------------------------------------------------------------
Date: Fri, 27 Jan 2012 13:44:49 -0500
From: Jeff Squyres <jsquy...@cisco.com>
Subject: Re: [OMPI users] pure static "mpirun" launcher
To: Open MPI Users <us...@open-mpi.org>

Ah ha, I think I've got it.  There was actually a bug related to disabling the memory 
manager in trunk/v1.5.x/v1.4.x.  I fixed it on the trunk and scheduled it for 
v1.6 (since we're trying very hard to get v1.5.5 out the door) and for v1.4.5.

On the OMPI trunk on RHEL 5 with gcc 4.4.6, I can do this:

./configure --without-memory-manager LDFLAGS=--static --disable-shared --enable-static

And get a fully static set of OMPI executables.  For example:

-----
[10:41] svbu-mpi:~ % cd $prefix/bin
[10:41] svbu-mpi:/home/jsquyres/bogus/bin % ldd *
mpic++:
        not a dynamic executable
mpicc:
        not a dynamic executable
mpiCC:
        not a dynamic executable
mpicxx:
        not a dynamic executable
mpiexec:
        not a dynamic executable
mpif77:
        not a dynamic executable
mpif90:
        not a dynamic executable
mpirun:
        not a dynamic executable
ompi-clean:
        not a dynamic executable
ompi_info:
        not a dynamic executable
ompi-ps:
        not a dynamic executable
ompi-server:
        not a dynamic executable
ompi-top:
        not a dynamic executable
opal_wrapper:
        not a dynamic executable
ortec++:
        not a dynamic executable
ortecc:
        not a dynamic executable
orteCC:
        not a dynamic executable
orte-clean:
        not a dynamic executable
orted:
        not a dynamic executable
orte-info:
        not a dynamic executable
orte-ps:
        not a dynamic executable
orterun:
        not a dynamic executable
orte-top:
        not a dynamic executable
-----

So I think the answer here is that it depends on a few factors:

1. You need the bug fix that I just committed.
2. Libtool is stripping out -static (and/or --static?), so you have to find 
some other flags to make your compiler/linker link statically.
3. Your OS has to support static builds.  For example, RHEL 6 doesn't install 
libc.a by default (it's apparently on the optional DVD, which I don't have).  
My RHEL 5.5 install does have it, though.


On Jan 27, 2012, at 11:16 AM, Jeff Squyres wrote:

> I've tried a bunch of variations on this, but I'm actually getting stymied by 
> my underlying OS not supporting static linking properly.  :-\
>
> I do see that Libtool is stripping out the "-static" standalone flag that you 
> passed into LDFLAGS.  Yuck.  What's -Wl,-E?  Can you try "-Wl,-static" 
> instead?
>
>
> On Jan 25, 2012, at 1:24 AM, Ilias Miroslav wrote:
>
>> Hello again,
>>
>> I need my own static "mpirun" for porting (together with the static executable) 
>> onto various (unknown) grid servers. In grid computing one cannot expect an 
>> OpenMPI-ILP64 installation on each computing element.
>>
>> Jeff: I tried LDFLAGS in configure
>>
>> ilias@194.160.135.47:~/bin/ompi-ilp64_full_static/openmpi-1.4.4/../configure 
>> --prefix=/home/ilias/bin/ompi-ilp64_full_static -without-memory-manager 
>> --without-libnuma --enable-static --disable-shared CXX=g++ CC=gcc 
>> F77=gfortran FC=gfortran FFLAGS="-m64 -fdefault-integer-8 -static" 
>> FCFLAGS="-m64 -fdefault-integer-8 -static" CFLAGS="-m64 -static" 
>> CXXFLAGS="-m64 -static"  LDFLAGS="-static  -Wl,-E"
>>
>> but I still got a dynamic, not a static, "mpirun":
>> ilias@194.160.135.47:~/bin/ompi-ilp64_full_static/bin/.ldd ./mpirun
>>      linux-vdso.so.1 =>  (0x00007fff6090c000)
>>      libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd7277cf000)
>>      libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007fd7275b7000)
>>      libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fd7273b3000)
>>      libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd727131000)
>>      libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
>> (0x00007fd726f15000)
>>      libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd726b90000)
>>      /lib64/ld-linux-x86-64.so.2 (0x00007fd7279ef000)
>>
>> Any help, please? The config.log is here:
>>
>> https://docs.google.com/open?id=0B8qBHKNhZAipNTNkMzUxZDEtNjJmZi00YzY3LWI4MmYtY2RkZDVkMjhiOTM1
>>
>> Best, Miro
>> ------------------------------
>> Date: Tue, 24 Jan 2012 11:55:21 -0500
>> From: Jeff Squyres <jsquy...@cisco.com>
>> Subject: Re: [OMPI users] pure static "mpirun" launcher
>> To: Open MPI Users <us...@open-mpi.org>
>>
>> Ilias: Have you simply tried building Open MPI with flags that force static 
>> linking?  E.g., something like this:
>>
>> ./configure --enable-static --disable-shared LDFLAGS=-Wl,-static
>>
>> I.e., put in LDFLAGS whatever flags your compiler/linker needs to force 
>> static linking.  These LDFLAGS will be applied to all of Open MPI's 
>> executables, including mpirun.
>>
>>
>> On Jan 24, 2012, at 10:28 AM, Ralph Castain wrote:
>>
>>> Good point! I'm traveling this week with limited resources, but I will try to 
>>> address this when I am able.
>>>
>>> Sent from my iPad
>>>
>>> On Jan 24, 2012, at 7:07 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>>
>>>> Am 24.01.2012 um 15:49 schrieb Ralph Castain:
>>>>
>>>>> I'm a little confused. Building procs static makes sense as libraries may 
>>>>> not be available on compute nodes. However, mpirun is only executed in 
>>>>> one place, usually the head node where it was built. So there is less 
>>>>> reason to build it purely static.
>>>>>
>>>>> Are you trying to move mpirun somewhere? Or is it the daemons that mpirun 
>>>>> launches that are the real problem?
>>>>
>>>> This depends: with a queuing system, the master node of a parallel job may 
>>>> itself be one of the slave nodes, where the jobscript runs. My nodes are 
>>>> uniform, but I have seen sites where that wasn't the case.
>>>>
>>>> An option would be a special queue which always executes the jobscript on the 
>>>> head node (i.e. without generating any load there) and uses only non-locally 
>>>> granted slots for mpirun. For this it might be necessary to give the head node 
>>>> a high number of slots in this queue, and to always request one slot on this 
>>>> machine in addition to the ones needed on the compute nodes.
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On Jan 24, 2012, at 5:54 AM, Ilias Miroslav <miroslav.il...@umb.sk> wrote:
>>>>>
>>>>>> Dear experts,
>>>>>>
>>>>>> following http://www.open-mpi.org/faq/?category=building#static-build I 
>>>>>> successfully built a static Open MPI library.
>>>>>> Using this library I succeeded in building a parallel static executable, 
>>>>>> dirac.x (ldd dirac.x: not a dynamic executable).
>>>>>>
>>>>>> The problem remains, however, with the mpirun (orterun) launcher.
>>>>>> While on the local machine, where I compiled both the static Open MPI and 
>>>>>> the static dirac.x, I am able to launch a parallel job with
>>>>>> <OpenMPI_static>/mpirun -np 2 dirac.x ,
>>>>>> I cannot launch it elsewhere, because "mpirun" is dynamically linked and 
>>>>>> thus machine dependent:
>>>>>>
>>>>>> ldd mpirun:
>>>>>>   linux-vdso.so.1 =>  (0x00007fff13792000)
>>>>>>   libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f40f8cab000)
>>>>>>   libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007f40f8a93000)
>>>>>>   libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f40f888f000)
>>>>>>   libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f40f860d000)
>>>>>>   libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
>>>>>> (0x00007f40f83f1000)
>>>>>>   libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f40f806c000)
>>>>>>   /lib64/ld-linux-x86-64.so.2 (0x00007f40f8ecb000)
>>>>>>
>>>>>> Please, how do I build a "pure" static mpirun launcher, usable (in my case 
>>>>>> together with the static dirac.x) on other computers as well?
>>>>>>
>>>>>> Thanks, Miro
>>>>>>
>>>>>> --
>>>>>> RNDr. Miroslav Iliaš, PhD.
>>>>>>
>>>>>> Katedra chémie
>>>>>> Fakulta prírodných vied
>>>>>> Univerzita Mateja Bela
>>>>>> Tajovského 40
>>>>>> 97400 Banská Bystrica
>>>>>> tel: +421 48 446 7351
>>>>>> email : miroslav.il...@umb.sk
>>>>>>
>>>>>> Department of Chemistry
>>>>>> Faculty of Natural Sciences
>>>>>> Matej Bel University
>>>>>> Tajovského 40
>>>>>> 97400 Banska Bystrica
>>>>>> Slovakia
>>>>>> tel: +421 48 446 7351
>>>>>> email :  miroslav.il...@umb.sk
>>>>>>


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

