Hello,

It seems that a return value is not updated during the setup of process affinity in the function ompi_mpi_init() (ompi/runtime/ompi_mpi_init.c:459).

The problem is in the following piece of code:

    [... here ret == OPAL_SUCCESS ...]
    phys_cpu = opal_paffinity_base_get_physical_processor_id(nrank);
    if (0 > phys_cpu) {
        error = "Could not get physical processor id - cannot set processor affinity";
        goto error;
    }
    [...]

If opal_paffinity_base_get_physical_processor_id() fails, ret is not updated and we reach the "error:" label while ret == OPAL_SUCCESS. As a result, MPI_Init() returns without having initialized the MPI_COMM_WORLD struct, which leads to a segmentation fault on calls like MPI_Comm_size().

I hit the bug recently with new Westmere processors, for which opal_paffinity_base_get_physical_processor_id() fails when the MCA parameter "opal_paffinity_alone 1" is used during the execution.

I'm not sure that it's the right way to fix the problem, but here is a patch tested with v1.5. With this patch the problem is reported instead of causing a segmentation fault.

With the patch, the output is:

--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  Could not get physical processor id - cannot set processor affinity
  --> Returned "Not found" (-5) instead of "Success" (0)
--------------------------------------------------------------------------

Without the patch, the output was:

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x10
[ 0] /lib64/libpthread.so.0 [0x3d4e20ee90]
[ 1] /home_nfs/thouveng/dev/openmpi-v1.5/lib/libmpi.so.0(MPI_Comm_size+0x9c) [0x7fce74468dfc]
[ 2] ./IMB-MPI1(IMB_init_pointers+0x2f) [0x40629f]
[ 3] ./IMB-MPI1(main+0x65) [0x4035c5]
[ 4] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3d4da1ea2d]
[ 5] ./IMB-MPI1 [0x403499]

Regards,
Guillaume

---
diff --git a/ompi/runtime/ompi_mpi_init.c b/ompi/runtime/ompi_mpi_init.c
--- a/ompi/runtime/ompi_mpi_init.c
+++ b/ompi/runtime/ompi_mpi_init.c
@@ -459,6 +459,7 @@ int ompi_mpi_init(int argc, char **argv,
             OPAL_PAFFINITY_CPU_ZERO(mask);
             phys_cpu = opal_paffinity_base_get_physical_processor_id(nrank);
             if (0 > phys_cpu) {
+                ret = phys_cpu;
                 error = "Could not get physical processor id - cannot set processor affinity";
                 goto error;
             }
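
For anyone who wants to see the failure mode outside of Open MPI, here is a minimal standalone sketch of the same pattern. It is not the real ompi_mpi_init() code: sketch_init(), sketch_get_physical_processor_id() and the SKETCH_* codes are hypothetical names, and I am assuming that the "error:" label only reports a failure when ret differs from the success value.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the OPAL status codes; -5 mirrors the
 * "Not found" value shown in the error output above. */
#define SKETCH_SUCCESS        0
#define SKETCH_ERR_NOT_FOUND (-5)

/* Hypothetical replacement for
 * opal_paffinity_base_get_physical_processor_id(): always fails. */
static int sketch_get_physical_processor_id(int rank)
{
    (void)rank;
    return SKETCH_ERR_NOT_FOUND;
}

/* Mimics the shape of the error path: one status variable, one
 * "error:" label. With propagate == 0 the status is left untouched
 * (the reported bug); with propagate == 1 the failure code is copied
 * into ret before the jump (the proposed fix). */
static int sketch_init(int propagate)
{
    int ret = SKETCH_SUCCESS;   /* all earlier steps succeeded */
    const char *error = NULL;
    int phys_cpu;

    phys_cpu = sketch_get_physical_processor_id(0);
    if (0 > phys_cpu) {
        if (propagate) {
            ret = phys_cpu;
        }
        error = "Could not get physical processor id";
        goto error;
    }
    /* ... the remaining initialization would go here ... */

error:
    /* Assumption: the failure is only reported when ret is not success. */
    if (SKETCH_SUCCESS != ret) {
        fprintf(stderr, "init failed: %s (%d)\n", error, ret);
    }
    return ret;
}
```

Calling sketch_init(0) returns 0 although the lookup failed and the rest of the initialization was skipped, so the caller carries on with uninitialized state. Calling sketch_init(1) returns -5 and prints the message, which is the behavior the patch aims for.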