This is a known issue:
        https://svn.open-mpi.org/trac/ompi/ticket/2087
Maybe its priority should be raised.
Lenny.

On Wed, Dec 30, 2009 at 12:13 PM, Daniel Spångberg <dani...@mkem.uu.se> wrote:

> Dear OpenMPI list,
>
> I have used the dynamic rules for collectives to be able to select one
> specific algorithm. With the latest versions of Open MPI this seems to be
> broken: just enabling coll_tuned_use_dynamic_rules causes the code to
> segfault. Note that I do not provide a rules file, since I only want to
> modify the behavior of one routine (see the example invocation below).
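>
> For reference, the kind of invocation I am after looks roughly like the
> following; the forced-algorithm parameter name is what ompi_info --param
> coll tuned reports on my install, and the algorithm index is purely
> illustrative. On 1.3.4 and 1.4 this crashes in the same way, since setting
> the flag by itself is already enough to trigger it.
>
> $ mpirun -mca coll_tuned_use_dynamic_rules 1 \
>          -mca coll_tuned_alltoall_algorithm 2 \
>          -np 8 ./bug_openmpi_1.4_test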
>
> I have tried the example code below on Open MPI 1.3.2, 1.3.3, 1.3.4, and
> 1.4. It *works* on 1.3.2 and 1.3.3, but segfaults on 1.3.4 and 1.4. I have
> confirmed this on Scientific Linux 5.2 and 5.4, and I have also reproduced
> the crash with version 1.4 running on Debian etch. All runs are on amd64,
> compiled from source with no configure options other than --prefix. The
> crash occurs whether I use the Intel 11.1 compiler (via env CC) or gcc, and
> regardless of whether the btl is set to openib,self, tcp,self, sm,self, or
> combinations of those. See below for ompi_info and other details.
> MPI_Alltoall, MPI_Alltoallv, and MPI_Allreduce all behave the same way.
>
> #include <stdlib.h>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
>  int rank,size;
>  char *buffer, *buffer2;
>
>  MPI_Init(&argc,&argv);
>
>  MPI_Comm_size(MPI_COMM_WORLD,&size);
>  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>
>  buffer=calloc(100*size,1);
>  buffer2=calloc(100*size,1);
>
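>  /* with coll_tuned_use_dynamic_rules=1, the segfault reported below occurs in this call */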
>  MPI_Alltoall(buffer,100,MPI_BYTE,buffer2,100,MPI_BYTE,MPI_COMM_WORLD);
>
>  MPI_Finalize();
>  return 0;
> }
>
> Demonstrated behaviour:
>
> $ ompi_info
>                 Package: Open MPI daniels@arthur Distribution
>                Open MPI: 1.4
>   Open MPI SVN revision: r22285
>   Open MPI release date: Dec 08, 2009
>                Open RTE: 1.4
>   Open RTE SVN revision: r22285
>   Open RTE release date: Dec 08, 2009
>                    OPAL: 1.4
>       OPAL SVN revision: r22285
>       OPAL release date: Dec 08, 2009
>            Ident string: 1.4
>                  Prefix:
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install
>  Configured architecture: x86_64-unknown-linux-gnu
>          Configure host: arthur
>           Configured by: daniels
>           Configured on: Tue Dec 29 16:54:37 CET 2009
>          Configure host: arthur
>                Built by: daniels
>                Built on: Tue Dec 29 17:04:36 CET 2009
>              Built host: arthur
>              C bindings: yes
>            C++ bindings: yes
>      Fortran77 bindings: yes (all)
>      Fortran90 bindings: yes
>  Fortran90 bindings size: small
>              C compiler: gcc
>     C compiler absolute: /usr/bin/gcc
>            C++ compiler: g++
>   C++ compiler absolute: /usr/bin/g++
>      Fortran77 compiler: gfortran
>  Fortran77 compiler abs: /usr/bin/gfortran
>      Fortran90 compiler: gfortran
>  Fortran90 compiler abs: /usr/bin/gfortran
>             C profiling: yes
>           C++ profiling: yes
>     Fortran77 profiling: yes
>     Fortran90 profiling: yes
>          C++ exceptions: no
>          Thread support: posix (mpi: no, progress: no)
>           Sparse Groups: no
>  Internal debug support: no
>     MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>         libltdl support: yes
>   Heterogeneous support: no
>  mpirun default --prefix: no
>         MPI I/O support: yes
>       MPI_WTIME support: gettimeofday
> Symbol visibility support: yes
>   FT Checkpoint support: no  (checkpoint thread: no)
>           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.4)
>              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4)
>           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4)
>
>               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4)
>               MCA carto: file (MCA v2.0, API v2.0, Component v1.4)
>           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4)
>               MCA timer: linux (MCA v2.0, API v2.0, Component v1.4)
>         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4)
>         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4)
>                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4)
>              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4)
>           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4)
>           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: basic (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: inter (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: self (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: sm (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: sync (MCA v2.0, API v2.0, Component v1.4)
>                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.4)
>                  MCA io: romio (MCA v2.0, API v2.0, Component v1.4)
>               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.4)
>               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4)
>               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4)
>                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.4)
>                 MCA pml: csum (MCA v2.0, API v2.0, Component v1.4)
>                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4)
>                 MCA pml: v (MCA v2.0, API v2.0, Component v1.4)
>                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4)
>              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.4)
>                 MCA btl: self (MCA v2.0, API v2.0, Component v1.4)
>                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4)
>                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4)
>                MCA topo: unity (MCA v2.0, API v2.0, Component v1.4)
>                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4)
>                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4)
>                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4)
>                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4)
>                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4)
>                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4)
>                MCA odls: default (MCA v2.0, API v2.0, Component v1.4)
>                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.4)
>               MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.4)
>               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.4)
>               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4)
>               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4)
>                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4)
>              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4)
>              MCA routed: direct (MCA v2.0, API v2.0, Component v1.4)
>              MCA routed: linear (MCA v2.0, API v2.0, Component v1.4)
>                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.4)
>                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.4)
>               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.4)
>              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4)
>                 MCA ess: env (MCA v2.0, API v2.0, Component v1.4)
>                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.4)
>                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.4)
>                 MCA ess: slurm (MCA v2.0, API v2.0, Component v1.4)
>                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.4)
>             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.4)
>             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.4)
>
> $ mpicc -O2 -o bug_openmpi_1.4_test bug_openmpi_1.4_test.c
> $ ldd ./bug_openmpi_1.4_test
>        libmpi.so.0 =>
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0
> (0x00002b33fa57e000)
>        libopen-rte.so.0 =>
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libopen-rte.so.0
> (0x00002b33fa821000)
>        libopen-pal.so.0 =>
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libopen-pal.so.0
> (0x00002b33faa6b000)
>        libdl.so.2 => /lib64/libdl.so.2 (0x00000032c7400000)
>        libnsl.so.1 => /lib64/libnsl.so.1 (0x00000032cfe00000)
>        libutil.so.1 => /lib64/libutil.so.1 (0x00000032d4a00000)
>        libm.so.6 => /lib64/libm.so.6 (0x00000032c7000000)
>        libpthread.so.0 => /lib64/libpthread.so.0 (0x00000032c7800000)
>        libc.so.6 => /lib64/libc.so.6 (0x00000032c6c00000)
>        /lib64/ld-linux-x86-64.so.2 (0x00000032c5c00000)
> $ mpirun -mca btl tcp,self -mca coll_tuned_use_dynamic_rules 0 -np 8
> ./bug_openmpi_1.4_test
> $ mpirun -mca btl tcp,self -mca coll_tuned_use_dynamic_rules 1 -np 8
> ./bug_openmpi_1.4_test
> [girasole:27510] *** Process received signal ***
> [girasole:27510] Signal: Segmentation fault (11)
> [girasole:27510] Signal code:  (128)
> [girasole:27510] Failing at address: (nil)
> [girasole:27503] *** Process received signal ***
> [girasole:27503] Signal: Segmentation fault (11)
> [girasole:27503] Signal code:  (128)
> [girasole:27503] Failing at address: (nil)
> [girasole:27510] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27510] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2ae2b29fbeb5]
> [girasole:27510] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2ae2b29fa8ca]
> [girasole:27510] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2ae2ae76bbff]
> [girasole:27510] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27510] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27510] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27510] *** End of error message ***
> [girasole:27503] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27503] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b534b1b6eb5]
> [girasole:27503] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b534b1b58ca]
> [girasole:27503] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2b5346f26bff]
> [girasole:27503] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27503] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27503] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27503] *** End of error message ***
> [girasole:27505] *** Process received signal ***
> [girasole:27505] Signal: Segmentation fault (11)
> [girasole:27505] Signal code:  (128)
> [girasole:27505] Failing at address: (nil)
> [girasole:27509] *** Process received signal ***
> [girasole:27509] Signal: Segmentation fault (11)
> [girasole:27509] Signal code:  (128)
> [girasole:27509] Failing at address: (nil)
> [girasole:27505] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27505] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2ab662aa0eb5]
> [girasole:27505] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2ab662a9f8ca]
> [girasole:27505] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2ab65e810bff]
> [girasole:27505] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27505] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27505] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27505] *** End of error message ***
> [girasole:27507] *** Process received signal ***
> [girasole:27507] Signal: Segmentation fault (11)
> [girasole:27507] Signal code:  (128)
> [girasole:27507] Failing at address: (nil)
> [girasole:27509] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27509] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b7dc1863eb5]
> [girasole:27509] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b7dc18628ca]
> [girasole:27509] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2b7dbd5d3bff]
> [girasole:27509] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27509] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27509] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27509] *** End of error message ***
> [girasole:27507] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27507] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b09eb873eb5]
> [girasole:27507] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b09eb8728ca]
> [girasole:27507] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2b09e75e3bff]
> [girasole:27507] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27507] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27507] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27507] *** End of error message ***
> [girasole:27504] *** Process received signal ***
> [girasole:27504] Signal: Segmentation fault (11)
> [girasole:27504] Signal code:  (128)
> [girasole:27504] Failing at address: (nil)
> [girasole:27506] *** Process received signal ***
> [girasole:27506] Signal: Segmentation fault (11)
> [girasole:27506] Signal code:  (128)
> [girasole:27506] Failing at address: (nil)
> [girasole:27504] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27504] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b6fde1afeb5]
> [girasole:27504] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b6fde1ae8ca]
> [girasole:27504] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2b6fd9f1fbff]
> [girasole:27504] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27504] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27504] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27504] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 7 with PID 27510 on node girasole exited
> on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> [girasole:27506] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27506] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b66f2908eb5]
> [girasole:27506] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b66f29078ca]
> [girasole:27506] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2b66ee678bff]
> [girasole:27506] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27506] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27506] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27506] *** End of error message ***
> [girasole:27508] *** Process received signal ***
> [girasole:27508] Signal: Segmentation fault (11)
> [girasole:27508] Signal code:  (128)
> [girasole:27508] Failing at address: (nil)
> [girasole:27508] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
> [girasole:27508] [ 1]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b89b09a1eb5]
> [girasole:27508] [ 2]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so
> [0x2b89b09a08ca]
> [girasole:27508] [ 3]
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)
> [0x2b89ac711bff]
> [girasole:27508] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
> [girasole:27508] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)
> [0x32c6c1d8b4]
> [girasole:27508] [ 6] ./bug_openmpi_1.4_test [0x400869]
> [girasole:27508] *** End of error message ***
>
>
> Best regards,
>
> --
> Daniel Spångberg
> Materialkemi
> Uppsala Universitet
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
