I snipped some parts of the exchange and am responding to two mails in this one
(this may not be proper netiquette on this ML?).

On 06/02/2010 03:54 PM, Jeff Squyres wrote:
> What happens if you run:
>
> ~/openmpi-1.4.2-bin/bin/mpirun --mca btl openib,sm,self ~/bwlat/mpi_helloworld
>
> (i.e., MX support is still compiled in, but remove MX from the run-time)

Sadly, exactly the same thing :( It doesn't seem to disable MX (the error
message is still there; I'm just guessing, as I said I don't really know
anything about MPI :-/).

$ ~/openmpi-1.4.2-bin/bin/mpirun --mca btl openib,sm,self ~/bwlat/mpi_helloworld
[bordeplage-9.bordeaux.grid5000.fr:32664] Error in mx_init (error No MX device entry in /dev.)
Hello world from process 0 of 1
[bordeplage-9:32664] *** Process received signal ***
[bordeplage-9:32664] Signal: Segmentation fault (11)
[bordeplage-9:32664] Signal code: Address not mapped (1)
[bordeplage-9:32664] Failing at address: 0x7f8410a1b360
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 32664 on node bordeplage-9.bordeaux.grid5000.fr exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

> I'm still guessing that there's some weird interaction between the memory
> management of those two plugins (MX and verbs). I don't know of anyone else
> who has this kind of configuration where it could be tested / debugged. :-(
>
> Per the above suggestion, let's see what happens if you run without MX and/or
> without openib via mpirun command line option. If that fixes the problem,
> that would mean you only have to change command line params when you run --
> not have 2 OMPI installs. Additionally, you might be able to leave both
> plugins enabled but setenv the OMPI_MCA_memory_ptmalloc2_disable environment
> variable to 1; this will disable the OMPI memory management stuff. Note that
> this is not a normal MCA parameter -- you cannot set it on the command line
> or in a file; it *must* be set as an environment variable (for boring,
> technical reasons -- I can explain if you care).

I can also confirm that setting the OMPI_MCA_memory_ptmalloc2_disable variable
to 1 does solve the segfault problem.

On 06/02/2010 04:24 PM, Scott Atchley wrote:
> Does the same error happen if he tries on a MX host that does not have IB?

This node only has a Myrinet card:

$ mpirun ~/bwlat/mpi_helloworld
warning:regcache incompatible with malloc
Hello world from process 0 of 1

Note that this is with openmpi-1.4.1.
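
For anyone finding this thread later, here is a minimal sketch of the workaround as a small wrapper script. The install and program paths are the ones from my runs above and will differ elsewhere; the existence check and fallback message are only there so the script does nothing harmful on a machine without this install. I have only tried this on a single node, so whether the variable also has to be forwarded to remote nodes is something I have not checked.

```shell
#!/bin/sh
# Paths taken from the runs in this thread; adjust to your own install.
MPIRUN="$HOME/openmpi-1.4.2-bin/bin/mpirun"
PROG="$HOME/bwlat/mpi_helloworld"

# This has to be a real environment variable: per Jeff's note it cannot be
# passed with "--mca" on the command line or put in an MCA parameter file.
OMPI_MCA_memory_ptmalloc2_disable=1
export OMPI_MCA_memory_ptmalloc2_disable

if [ -x "$MPIRUN" ] && [ -x "$PROG" ]; then
    # Both MX and openib stay compiled in; only the memory hooks are disabled.
    "$MPIRUN" "$PROG"
else
    echo "mpirun or test program not found; OMPI_MCA_memory_ptmalloc2_disable=$OMPI_MCA_memory_ptmalloc2_disable"
fi
```

With csh/tcsh the equivalent of the export line would be "setenv OMPI_MCA_memory_ptmalloc2_disable 1", as in Jeff's original suggestion.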