Re: [OMPI users] mpirun issue using more than 64 hosts

2018-02-12 Thread Adam Sylvester
Ah... thanks Gilles. That makes sense. I was stuck thinking there was an ssh problem on rank 0; it never occurred to me that mpirun was doing something clever there and that those ssh errors came from a different instance altogether. It's no problem to put my private key on all instances - I'll go

Re: [OMPI users] mpirun issue using more than 64 hosts

2018-02-12 Thread Gilles Gouaillardet
Adam, by default, when more than 64 hosts are involved, mpirun uses a tree spawn to remotely launch the orted daemons. That means you have two options here: - allow all compute nodes to ssh to each other (e.g. the ssh private key of *all* the nodes should be in *all* the authorized_keys -
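
To illustrate why a tree spawn produces ssh errors on hosts other than the one running mpirun, here is a minimal sketch of a tree-shaped launch plan. It is a simplified model, not Open MPI's actual routing code: the function names `plan_tree_launch` and `hosts_needing_outbound_ssh`, the k-ary layout, and the `radix` value are all assumptions made for illustration. The point it shows is that any host appearing as a parent in the plan must itself be able to ssh to its children.

```python
from collections import defaultdict


def plan_tree_launch(hosts, radix):
    """Map each launching host to the hosts it would start over ssh.

    Hypothetical k-ary layout: hosts[0] is the node running mpirun, and
    the parent of hosts[i] is hosts[(i - 1) // radix].
    """
    if radix < 1:
        raise ValueError("radix must be at least 1")
    plan = defaultdict(list)
    for i in range(1, len(hosts)):
        parent = hosts[(i - 1) // radix]
        plan[parent].append(hosts[i])
    return dict(plan)


def hosts_needing_outbound_ssh(hosts, radix):
    """Return the hosts that must be able to open ssh connections."""
    return sorted(plan_tree_launch(hosts, radix))


if __name__ == "__main__":
    # Illustrative addresses only; radix=2 keeps the output short.
    hosts = [f"10.0.0.{n}" for n in range(1, 8)]
    for parent, children in plan_tree_launch(hosts, radix=2).items():
        print(parent, "->", ", ".join(children))
```

In this model, as soon as the host list is long enough for a second level to exist, hosts other than the first one start opening ssh connections, so they need working credentials as well.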

[OMPI users] mpirun issue using more than 64 hosts

2018-02-12 Thread Adam Sylvester
I'm running OpenMPI 2.1.0, built from source, on RHEL 7. I'm using the default ssh-based launcher, where I have my private ssh key on rank 0 and the associated public key on all ranks. I create a hosts file with a list of unique IPs, with the host that I'm running mpirun from on the first line, a