Hello,
I figured out the problem and am describing it here in case it is
useful to someone else. The problem stems from the ompi_local_slave
option being set on its own in the MPI_Info structure. It seems that
MPI_Info_create uses a shift or, more likely, a masking operation
(depending on the size of some type, which in turn depends on the
underlying architecture) that sets the ompi_local_slave bit high.
As a result, "jdata->controls" has its ORTE_JOB_CONTROL_LOCAL_SLAVE bit
set; see plm_rsh_module.c (line 1065) for where this causes trouble. I
took the easy solution and explicitly set ompi_local_slave to "no" in
the Info structure, which solves the problem. This may need further
investigation.
Regards,
On 1/21/11 7:22 PM, Avinash Malik wrote:
Hello,
I have compiled openmpi-1.5.1 as a 32-bit binary on a 64-bit
architecture. MPI_Comm_spawn and MPI_Comm_spawn_multiple fail with a
segmentation fault when the MPI_Info argument is anything other than
MPI_INFO_NULL. The exact same code runs fine on a 32-bit machine. I
cannot use a 64-bit openmpi because other software that uses openmpi
can only be compiled in 32-bit mode.
I am attaching all the information in a .tgz file. The .tgz
file contains:
(1) The C code for a small example, in two files: parent.c and
child.c
(2) The compile_command that I ran on a 64-bit machine.
(3) The run command to run the system
compiling openmpi-1.5.1.
(4) ompi_info_all
(5) The error that I get (a segmentation fault).
Regards,