Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-10-06 Thread Andrew Benson
Ok, thanks - that's good to know. -Andrew -- * Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html * Galacticus: http://sites.google.com/site/galacticusmodel On Sat, Oct 6, 2018, 10:02 AM Ralph H Castain wrote: > Just FYI: on master (and perhaps 4.0), child jobs do not

Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-10-06 Thread Ralph H Castain
Just FYI: on master (and perhaps 4.0), child jobs do not inherit their parent's mapping policy by default. You have to add “-mca rmaps_base_inherit 1” to your mpirun cmd line. > On Oct 6, 2018, at 10:00 AM, Andrew Benson > wrote: > > Thanks, I'll try this right away. > > Thanks, > Andrew >
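A minimal sketch of what that command line might look like, assuming the "--map-by node" option and the test.exe binary mentioned elsewhere in this thread. The full mpirun invocation from the original report is not shown, so treat everything other than the "-mca rmaps_base_inherit 1" flag itself as illustrative. The snippet only builds and prints the command, so it can be checked without an MPI installation.

```shell
# Sketch only: "--map-by node" and ./test.exe come from other messages in this
# thread; the real command line from the report is not shown, so this is an assumption.
# "-mca rmaps_base_inherit 1" is the flag quoted above: it asks child (spawned)
# jobs to inherit the parent's mapping policy.
LAUNCH="mpirun --map-by node -mca rmaps_base_inherit 1 ./test.exe"

# Print the command instead of running it, so the line can be reviewed first.
echo "$LAUNCH"
```

Per the message above, this applies to master (and perhaps 4.0); whether the parameter is available in a given installed version is something to check locally.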

Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-10-06 Thread Ralph H Castain
Sorry for delay - this should be fixed by https://github.com/open-mpi/ompi/pull/5854 > On Sep 19, 2018, at 8:00 AM, Andrew Benson wrote: > > On further investigation removing the "preconnect_all" option does change the > problem at least. Without "preconnect_all" I no longer see: > >

Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-09-19 Thread Andrew Benson
On further investigation, removing the "preconnect_all" option does at least change the problem. Without "preconnect_all" I no longer see: -- At least one pair of MPI processes are unable to reach each other for MPI communicat

Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-09-16 Thread Andrew Benson
Unfortunately, removing the preconnect_all option didn't resolve the problem. I tried changing a few of the other options that I pass to mpirun. What does seem to make a difference is the "--map-by node" option. If I remove that option then my test code runs successfully - the output is in the

Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-09-16 Thread Andrew Benson
Thanks - I'll try removing that option. On Sunday, September 16, 2018 7:03:15 AM PDT Ralph H Castain wrote: > I see you are using “preconnect_all” - that is the source of the trouble. I > don’t believe we have tested that option in years and the code is almost > certainly dead. I’d suggest removin

Re: [OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-09-16 Thread Ralph H Castain
I see you are using “preconnect_all” - that is the source of the trouble. I don’t believe we have tested that option in years and the code is almost certainly dead. I’d suggest removing that option and things should work. > On Sep 15, 2018, at 1:46 PM, Andrew Benson wrote: > > I'm running int

[OMPI users] Unable to spawn MPI processes on multiple nodes with recent version of OpenMPI

2018-09-15 Thread Andrew Benson
I'm running into problems trying to spawn MPI processes across multiple nodes on a cluster using recent versions of OpenMPI. Specifically, using the attached Fortran code, compiled using OpenMPI 3.1.2 with: mpif90 test.F90 -o test.exe and run via a PBS scheduler using the attached test1.pbs, it