Hi,

I encountered the same SEGV reported on the users list when
running the varList program.

  http://www.open-mpi.org/community/lists/users/2014/07/24792.php

mpiexec -n 1 ./varList:
----------------------------------------------------------------
... snip ...
event                                             U/D-2 CHAR   n/a      ALL
event_base_verbose                                D/D-8 INT    n/a      LOCAL   
 0
event_libevent2021_event_include                  U/A-3 CHAR   n/a      LOCAL   
 poll
opal_event_include                                U/A-3 CHAR   n/a      LOCAL   
 poll
event_libevent2021_major_version                  D/A-9 INT    n/a      UNKNOWN 
 1
event_libevent2021_minor_version                  D/A-9 INT    n/a      UNKNOWN 
 9
event_libevent2021_release_version                D/A-9 INT    n/a      UNKNOWN 
 0
shmem                                             U/D-2 CHAR   n/a      ALL
shmem_base_verbose                                D/D-8 INT    n/a      LOCAL   
 0
shmem_base_RUNTIME_QUERY_hint                     D/A-9 CHAR   n/a      ALL-EQ
shmem_mmap_priority                               U/A-3 INT    n/a      ALL     
 50
shmem_mmap_enable_nfs_warning                     D/A-9 INT    n/a      LOCAL   
 true
shmem_mmap_relocate_backing_file                  D/A-9 INT    n/a      ALL     
 0
shmem_mmap_backing_file_base_dir                  D/A-9 CHAR   n/a      ALL     
 /dev/shm
shmem_mmap_major_version                          D/A-9 INT    n/a      UNKNOWN 
 1
shmem_mmap_minor_version                          D/A-9 INT    n/a      UNKNOWN 
 9
shmem_mmap_release_version                        D/A-9 INT    n/a      UNKNOWN 
 0
shmem_posix_major_version                         D/A-9 INT    n/a      UNKNOWN 
 1201644720
shmem_posix_minor_version                         D/A-9 INT    n/a      UNKNOWN 
 32756
shmem_posix_release_version                       D/A-9 INT    n/a      UNKNOWN 
 6
[ppc:12688] *** Process received signal ***
[ppc:12688] Signal: Segmentation fault (11)
[ppc:12688] Signal code: Invalid permissions (2)
[ppc:12688] Failing at address: 0x7ff4479f83d8
[ppc:12688] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x325c0)[0x7ff4493015c0]
[ppc:12688] [ 1] /home/rivis/opt/openmpi-trunk-debug/lib/libmpi.so.0(PMPI_T_cvar_read+0xbc)[0x7ff44970abb7]
[ppc:12688] [ 2] ./varlist(list_cvars+0x56a)[0x4029bc]
[ppc:12688] [ 3] ./varlist(main+0x42b)[0x403598]
[ppc:12688] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7ff4492edeed]
[ppc:12688] [ 5] ./varlist[0x4016c9]
[ppc:12688] *** End of error message ***
----------------------------------------------------------------

I tracked down this error and found that it seems to be related to DSO builds.

The error occurs when accessing value->intval for the
control variable shmem_sysv_major_version in MPI_T_cvar_read.

  https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/tool/cvar_read.c

The 'value' pointer was obtained from mca_base_var_get_value, and it points to
mca_shmem_sysv_component.super.base_version.mca_component_major_version,
whose component was dlclose'd in MPI_INIT in a DSO build.
(The mmap component is selected in my environment.)
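
To illustrate the mechanism, here is a minimal sketch in plain C. It is
not Open MPI code: var_entry, var_register, component_unloading, var_read
and owner_id are all hypothetical names, and ordinary int variables stand
in for the version fields that live inside a component's DSO. It models a
variable registry that keeps pointers into component-owned storage, and
one possible way (an assumption, not a proposed patch) to avoid reading
through such a pointer once its owner has been unloaded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VARS 8

/* Hypothetical registry entry: the value is NOT copied; the registry
 * only remembers where the owning component keeps it. */
struct var_entry {
    const char *name;
    const int  *storage;   /* points into component-owned memory */
    int         owner_id;  /* hypothetical component identifier */
};

static struct var_entry registry[MAX_VARS];
static size_t registry_len = 0;

/* Register a variable; returns its index, or -1 if the table is full. */
static int var_register(const char *name, const int *storage, int owner_id)
{
    assert(name != NULL && storage != NULL);
    if (registry_len == MAX_VARS) {
        return -1;
    }
    registry[registry_len].name = name;
    registry[registry_len].storage = storage;
    registry[registry_len].owner_id = owner_id;
    return (int)registry_len++;
}

/* To be called before a component's memory goes away (the dlclose step
 * in the real case): drop every pointer into that component. */
static void component_unloading(int owner_id)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (registry[i].owner_id == owner_id) {
            registry[i].storage = NULL;
        }
    }
}

/* Read a variable; returns 0 on success, -1 if the index is invalid or
 * the owning component has been unloaded. */
static int var_read(int index, int *out)
{
    if (index < 0 || (size_t)index >= registry_len) {
        return -1;
    }
    if (registry[index].storage == NULL) {
        return -1;
    }
    *out = *registry[index].storage;
    return 0;
}
```

In this model, registering a variable for an unselected component and
then calling component_unloading for that component makes a later
var_read return -1 instead of dereferencing the stale pointer. Without
the invalidation step, var_read would do what the backtrace above shows:
dereference memory that may no longer be mapped.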

The abnormal shmem_posix_{major,minor,release}_version values in
my output above have the same cause. A SEGV occurs if the memory
has already been returned to the kernel; abnormal values are
printed if it has not.

So this SEGV doesn't occur if I configure Open MPI with the
--disable-dlopen option. I think that is why Nathan doesn't
see this error.

Regards,
KAWASHIMA Takahiro
