As Jeff says, the output of "ip addr" is critical. Stupid question from me: what is the network topology and type here? Do you have two physical networks?
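
In case it helps when reading that output, here is a minimal Python sketch (mine, not anything shipped with Open MPI) that groups IPv4 addresses by interface from the one-line-per-address format of "ip -o -4 addr show" and reports the subnet each address belongs to. The SAMPLE text, its addresses, and the function names are made up for illustration; paste real output in place of SAMPLE. As far as I know, btl_tcp_if_include accepts CIDR subnets as well as interface names, so the subnets printed are candidates to try there, but I have not verified that this resolves the error in this thread.

```python
import ipaddress
from collections import defaultdict

# Hypothetical sample of `ip -o -4 addr show` output (addresses are invented).
SAMPLE = """\
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 10.0.0.5/24 brd 10.0.0.255 scope global eth0
2: eth0    inet 192.168.1.5/24 brd 192.168.1.255 scope global secondary eth0
"""


def addresses_by_interface(ip_output):
    """Map interface name -> list of IPv4Interface objects found on it."""
    result = defaultdict(list)
    for line in ip_output.splitlines():
        fields = line.split()
        # Expected layout: "<index>:", "<ifname>", "inet", "<addr>/<prefix>", ...
        if len(fields) < 4 or fields[2] != "inet":
            continue
        result[fields[1]].append(ipaddress.ip_interface(fields[3]))
    return dict(result)


def multi_address_interfaces(addrs):
    """Return the sorted names of interfaces carrying more than one IPv4 address."""
    return sorted(name for name, found in addrs.items() if len(found) > 1)


if __name__ == "__main__":
    addrs = addresses_by_interface(SAMPLE)
    for name, found in addrs.items():
        for iface in found:
            print(f"{name}: {iface.ip} is in subnet {iface.network}")
    print("Interfaces with more than one IPv4 address:",
          multi_address_interfaces(addrs))
```

If an interface shows up with two addresses in different subnets, that would at least answer whether the two IPs belong to separate logical networks sharing one physical link.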

On Thu, 7 Jul 2022 at 16:11, Jeff Squyres (jsquyres) via users <users@lists.open-mpi.org> wrote:

> Can you send the full output of "ifconfig" (or "ip addr") from one of your
> compute nodes?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> ________________________________________
> From: users <users-boun...@lists.open-mpi.org> on behalf of George
> Johnson via users <users@lists.open-mpi.org>
> Sent: Monday, July 4, 2022 11:06 AM
> To: users@lists.open-mpi.org
> Cc: George Johnson
> Subject: [OMPI users] Multiple IPs on network interface
>
> Hi,
>
> I am aware that section 13 in the FAQ says that MPI "in general" wont work
> with a network interface that has two IPs. However, I've had slurm running
> python and C programs on a cluster of 21 nodes for a while and haven't had
> any issues until I tried running some OSU micro benchmarks. This resulted
> in this error<https://pastebin.com/uY9TJF9x> , i'm not entirely sure why
> each node has two IPs, i believe it is related to netbooting as they are
> all netbooted.
>
> This is the slurm script<https://pastebin.com/vn8nbSxQ> I'm using to
> start the job. The -mca section I added to fix the problem however doesn't
> do anything as both the ips are on the eth0 interface.
>
> Is there anything I can do to run these benchmarks?
>
> Let me know what other details I need to provide as I'm not sure where to
> start.
>
> Thanks,
>
> George Johnson
>