I keep explaining that we don't "discard" anything, but there really isn't any
point in continuing to try to explain the system. With the announced intention
of completing the move of the BTLs to OPAL, I no longer need the multi-module
complexity in the OOB/TCP. So I have removed it and gone back
Jeff,
as pointed out by Ralph, I do wish to use eth0 for OOB messages.
I work on a 4k+ node cluster with a very decent gigabit Ethernet
network (reasonable oversubscription + switches
from a reputable vendor you are familiar with ;-) )
My experience is that IPoIB can be very slow at establishing a
connection
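(For what it's worth, the knob for pinning the OOB to eth0 would be the
oob_tcp_if_include MCA parameter -- assuming the 1.8-era parameter name --
for example, with ./a.out as a placeholder binary:

mpirun -np 2 --mca oob_tcp_if_include eth0 ./a.out
)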
On Jun 5, 2014, at 7:09 AM, Ralph Castain wrote:
> Okay, before you go chasing this, let me explain that we already try to
> address this issue in the TCP oob. When we need to connect to someone, we do
> the following:
>
> 1. if we have a direct connection available, we hand the message to the
> software module assigned to that NIC
Coll/ml does disqualify itself if processes are not bound. The problem here is
that there is an inconsistency between the two sides of the intercommunicator.
I can write a quick fix for 1.8.2.
-Nathan
From: devel [devel-boun...@open-mpi.org] on behalf of Gilles Gouaillardet
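(Until such a fix lands, coll/ml can also be disabled for every run rather
than per mpirun invocation; a sketch assuming the default per-user MCA
parameter file location:

echo "coll = ^ml" >> $HOME/.openmpi/mca-params.conf
)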
Okay, before you go chasing this, let me explain that we already try to address
this issue in the TCP oob. When we need to connect to someone, we do the
following:
1. if we have a direct connection available, we hand the message to the
software module assigned to that NIC
2. if none of the available
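In pseudo-C, step 1 amounts to a subnet match from the peer's address to the
per-NIC software module. All names and types below are hypothetical stand-ins,
not the real orte/mca/oob/tcp code, and step 2's fallback is elided here just
as it is in the quoted mail:

#include <stddef.h>

/* hypothetical stand-in for a per-NIC software module */
typedef struct {
    const char *if_name;   /* e.g. "eth0", "ib0" */
    unsigned    subnet;    /* NIC's network address, host byte order */
    unsigned    netmask;
} nic_module_t;

/* step 1: a peer on the same subnet as one of our NICs has a "direct
 * connection"; the message goes to the module assigned to that NIC */
static nic_module_t *pick_module(nic_module_t *mods, size_t n, unsigned peer)
{
    for (size_t i = 0; i < n; i++) {
        if ((peer & mods[i].netmask) == mods[i].subnet) {
            return &mods[i];
        }
    }
    return NULL;  /* no direct connection: fall through to step 2 */
}

int main(void)
{
    nic_module_t mods[] = {
        { "eth0", 0x0A000000u, 0xFF000000u },   /* 10.0.0.0/8     */
        { "ib0",  0xC0A80000u, 0xFFFF0000u },   /* 192.168.0.0/16 */
    };
    /* a peer at 10.0.0.1 would be handed to the eth0 module */
    return pick_module(mods, 2, 0x0A000001u) ? 0 : 1;
}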
Because Gilles wants to avoid using IB for TCP messages, and using eth0 also
solves the problem (the messages just route)
On Jun 5, 2014, at 5:00 AM, Jeff Squyres (jsquyres) wrote:
> Another random thought for Gilles situation: why not oob-TCP-if-include ib0?
> (And not eth0)
>
> That should solve his problem, but not the larger issue I raised in my
> previous email.
Another random thought for Gilles situation: why not oob-TCP-if-include ib0?
(And not eth0)
That should solve his problem, but not the larger issue I raised in my previous
email.
Sent from my phone. No type good.
On Jun 4, 2014, at 9:32 PM, "Gilles Gouaillardet"
gilles.gouaillar...@gm
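(In actual MCA-parameter spelling that would presumably be
--mca oob_tcp_if_include ib0 on the mpirun command line.)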
That raises a larger issue -- what about Ethernet-only clusters that span
multiple IP/L3 subnets? This is a scenario that Cisco definitely wants to
enable/support.
The usnic BTL, for example, can handle this scenario. We hadn't previously
considered the TCP oob component effects in this scenario.
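For the multi-subnet case, note that the if_include parameters accept CIDR
notation (the BTL/TCP one does; assuming the oob/tcp component matches it),
so a run spanning two L3 subnets might look like this, with the subnets and
binary as placeholders:

mpirun -np 128 --mca oob_tcp_if_include 10.1.0.0/16,10.2.0.0/16 ./a.out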
Folks,
on my single-socket, four-core VM (no batch manager), I am running the
intercomm_create test from the ibm test suite.
mpirun -np 1 ./intercomm_create
=> OK
mpirun -np 2 ./intercomm_create
=> HANG :-(
mpirun -np 2 --mca coll ^ml ./intercomm_create
=> OK
basically, the first two tasks w
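For readers without the suite at hand, the pattern such a test exercises is
roughly the following -- a minimal analogue, not the actual ibm test code,
assuming an even -np:

#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Comm local, inter;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* split the world into two halves; each half is one side of the bridge */
    int color = (rank < size / 2) ? 0 : 1;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &local);

    /* rank 0 of the other half (in MPI_COMM_WORLD) is the remote leader */
    int remote_leader = (color == 0) ? size / 2 : 0;
    MPI_Intercomm_create(local, 0, MPI_COMM_WORLD, remote_leader, 42, &inter);

    /* a collective over the intercommunicator is where the two sides must
     * agree on the coll component in use */
    MPI_Barrier(inter);

    MPI_Comm_free(&inter);
    MPI_Comm_free(&local);
    MPI_Finalize();
    return 0;
}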
WHAT: Open our low-level communication infrastructure by moving all
necessary components
(btl/rcache/allocator/mpool) down into OPAL.
WHY: All the components required for inter-process communications are
currently deeply integrated in the OMPI
layer. Several groups/institutions
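Concretely, the move amounts to relocating the component directories
(illustrative paths following the current tree layout):

ompi/mca/btl/       -> opal/mca/btl/
ompi/mca/rcache/    -> opal/mca/rcache/
ompi/mca/allocator/ -> opal/mca/allocator/
ompi/mca/mpool/     -> opal/mca/mpool/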