Thanks, Ralph!
When I add --mca oob_tcp_if_include ib0 (where ib0 is the InfiniBand interface
reported by ifconfig) to mpirun, it starts working correctly!
Why doesn't Open MPI do this itself?
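For the record, the command line looks roughly like the sketch below. The process count and the binary name are placeholders, not my real job; only the --mca oob_tcp_if_include ib0 part is what made the difference.

```shell
# Hypothetical placeholders: adjust these to your own job.
iface=ib0            # InfiniBand interface name as shown by ifconfig
np=4                 # number of processes (placeholder)
app=./my_mpi_app     # MPI binary (placeholder)

# Restrict the out-of-band TCP channel (the connection back to mpirun)
# to the chosen interface.
cmd="mpirun --mca oob_tcp_if_include $iface -np $np $app"

# Print the assembled command for review before running it.
echo "$cmd"
```

Once the printed line looks right, it can be pasted into the shell or a job script as-is.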
Tue, 22 Jul 2014 11:26:16 -0700 from Ralph Castain:
>Okay, the problem is that the connection back to mpirun isn't getti
It's supposed to, so it sounds like we have a bug in the connection failover
mechanism. I'll address it
On Jul 23, 2014, at 1:21 AM, Timur Ismagilov wrote:
> Thanks, Ralph!
> When I add --mca oob_tcp_if_include ib0 (where ib0 is infiniband interface
> from ifconfig) to mpirun it starts working
It seems that the network was not consistently wired.
Port DOWN means that the port was not wired (or has a bad cable). Moreover, on
some nodes port 1 is connected, on others port 2.
My concern is that they are not connected to the same subnet. If you have at
least one port on each node connected to the s
Ahsan,
This link might be helpful in trying to diagnose and treat IB fabric issues:
http://docs.oracle.com/cd/E18476_01/doc.220/e18478/fabric.htm#CIHIHJGD
You might try resetting the problematic port, or just use port 2 for your
jobs as a quick workaround:
-mca btl_openib_if_include mlx4_0:2
J
Hi,
today I installed openmpi-1.8.2rc2r32288 on my machines (Solaris 10
Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with
Sun C 5.12 and gcc-4.9.0. Unfortunately I have problems with both
compilers on "Solaris 10 Sparc". My small program works as expected
on "Solaris 10 x86_64" and Li