Hi,

I'm new to the list and quite new to the world of MPI.

A bit of background: I'm a sysadmin and have to provide a working environment (Debian-based) for researchers to work with MPI. I'm _NOT_ an Open MPI user. I know C, but that's all.

I compile Open MPI with the following configure options:

--prefix=/usr --with-openib=/usr --with-mx=/usr

(yes, everything goes in /usr)

When running an MPI application (any application) on a machine equipped with InfiniBand hardware, I get a segmentation fault during MPI_Finalize(). The same code runs fine on machines that have no InfiniBand devices.

<code>
#include <stdio.h>
#include <unistd.h>   /* sleep() */
#include <mpi.h>

int main (int argc, char *argv[])
{
  /* volatile so that a value written from a debugger is actually re-read */
  volatile int i = 0;
  int rank, size;

  MPI_Init (&argc, &argv);                /* starts MPI */
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);  /* get current process id */
  MPI_Comm_size (MPI_COMM_WORLD, &size);  /* get number of processes */

  /* wait here until i is set to a non-zero value (e.g. from an attached gdb) */
  while (i == 0)
    sleep(5);

  printf ("Hello world from process %d of %d\n", rank, size);
  MPI_Finalize();
  return 0;
}
</code>

My gdb-fu is quite rusty, but I get the vague idea that it happens somewhere in MPI_Finalize() (I can probably dig a bit there to find exactly where, if it's relevant).

I'm running it with:

$ mpirun --mca orte_base_help_aggregate 0 --mca plm_rsh_agent oarsh -machinefile nodefile ./mpi_helloworld

After various tests, it was suggested that I try recompiling Open MPI with the --without-memory-manager configure option. That actually solves the issue and everything runs fine.

From what I understand (correct me if I'm wrong), the "memory manager" is used with InfiniBand RDMA to keep a somewhat persistent memory region registered on the device instead of destroying and recreating it every time. So is this only a performance-tuning matter, where building without it disables the Open MPI "leave_pinned" option?

The various questions I have:

Is this bug/behaviour known?
If so, is there a better workaround?
As I'm not an Open MPI user, I don't really know whether it's considered acceptable to have this option disabled?
Does the list want more details on this bug?

Thanks,
Guillaume Ranquet.
Grid5000 support-staff.