On Wed, Sep 21, 2016 at 9:36 AM, Gilles Gouaillardet <
[email protected]> wrote:
>
> if i want to exclude ib0, i might want to
> mpirun --mca btl_tcp_if_exclude ib0 ...
>
> to me, this is an honest mistake, but with your proposal, i would be
> screwed when
> running on more than one node because i should have
> mpirun --mca btl_tcp_if_exclude ib0,lo ...
My view on this particular honest mistake is that it feels a lot like
failing to include "self" in the btl list.
To the best of my knowledge there is no "safety net" for that user mistake.
Instead, there is documentation in README:
- If specified, the "btl" parameter must include the "self"
component, or Open MPI will not be able to deliver messages to the
same rank as the sender. For example: "mpirun --mca btl tcp,self
..."
So, one could/should do the same for btl_tcp_if_exclude.
BUT IT IS ALREADY IN THE README TODAY!
Immediately following the warning above regarding "self" is the following
text:
- If specified, the "btl_tcp_if_exclude" parameter must include the
loopback device ("lo" on many Linux platforms), or Open MPI will
not be able to route MPI messages using the TCP BTL. For example:
"mpirun --mca btl_tcp_if_exclude lo,eth1 ..."
So, in short, there is *already* documentation that tells the user *not* to
do what Gilles is worried about.
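
To make the two README rules concrete, here is a minimal sketch of the kind of
pre-flight check a user's own wrapper script could run before calling mpirun.
It is purely illustrative and not part of Open MPI: the function name
check_mca_params, the dictionary input, and the set of loopback names
("lo", "lo0") are all assumptions of mine. It only compares interface names,
and it skips the "self" check when the btl value uses the "^" exclusion form.

```python
# Hypothetical pre-flight check for the two README rules quoted above.
# Nothing here is Open MPI code; all names are illustrative assumptions.

# Assumption: loopback device names to look for ("lo" on many Linux platforms).
LOOPBACK_NAMES = {"lo", "lo0"}


def check_mca_params(mca_params):
    """Return a list of warning strings for a dict of MCA parameter values."""
    warnings = []

    # Rule 1: if "btl" is specified as an inclusive list, it must contain "self".
    btl = mca_params.get("btl")
    if btl is not None and not btl.startswith("^"):
        if "self" not in btl.split(","):
            warnings.append('btl does not include "self"')

    # Rule 2: if "btl_tcp_if_exclude" is specified, it must still name loopback.
    exclude = mca_params.get("btl_tcp_if_exclude")
    if exclude is not None:
        if not LOOPBACK_NAMES & set(exclude.split(",")):
            warnings.append(
                "btl_tcp_if_exclude does not include a loopback device"
            )

    return warnings


if __name__ == "__main__":
    # The honest mistake from Gilles' example, then the corrected form.
    print(check_mca_params({"btl_tcp_if_exclude": "ib0"}))
    print(check_mca_params({"btl_tcp_if_exclude": "ib0,lo"}))
```

With this sketch, the "ib0" case yields one warning and the "ib0,lo" case
yields none, which mirrors the two mpirun command lines quoted at the top.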
-Paul
--
Paul H. Hargrove [email protected]
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
_______________________________________________
devel mailing list
[email protected]
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel