I think this is expected behavior.

If you have networks that you need Open MPI to ignore (e.g., a private network 
that *looks* reachable between multiple servers -- because the interfaces are 
on the same subnet -- but actually *isn't*), then the include/exclude mechanism 
is the right way to exclude them.
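
As a rough sketch of what that could look like for the reproducer quoted below: br0 and eth0 come from Gilles's report, while ./a.out, "-np 2", and the two shell variable names are placeholders assumed for illustration only. The script prints the command lines instead of running them.

```shell
#!/bin/sh
# Sketch only: br0 and eth0 are taken from the report quoted below.
# ./a.out, "-np 2", INCLUDE_CMD and EXCLUDE_CMD are illustrative placeholders.

# Option 1: list only the interface(s) the TCP BTL is allowed to use.
INCLUDE_CMD="mpirun -np 2 --mca btl_tcp_if_include eth0 ./a.out"

# Option 2: exclude the private bridge instead. Setting btl_tcp_if_exclude
# replaces its default value, so loopback (lo) is listed explicitly here.
# The include and exclude parameters should not be set at the same time.
EXCLUDE_CMD="mpirun -np 2 --mca btl_tcp_if_exclude lo,br0 ./a.out"

# Print rather than execute, so this is safe to try without Open MPI installed.
echo "$INCLUDE_CMD"
echo "$EXCLUDE_CMD"
```

Which of the two fits better probably depends on the cluster: include is simpler when every node uses the same interface name for the real network, and exclude is easier when only the problematic interface is known.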

That being said, I'm not sure why the behavior is different between trunk and 
v1.8.


On Aug 13, 2014, at 1:41 AM, Gilles Gouaillardet 
<gilles.gouaillar...@iferc.org> wrote:

> Folks,
> 
> I noticed that mpirun (trunk) hangs when running any MPI program on two nodes
> *and* each node has a private network with the same IP address
> (in my case, each node has a private network to a MIC)
> 
> in order to reproduce the problem, you can simply run (as root) on the
> two compute nodes
> brctl addbr br0
> ifconfig br0 192.168.255.1 netmask 255.255.255.0
> 
> mpirun will hang
> 
> a workaround is to add --mca btl_tcp_if_include eth0
> 
> v1.8 does not hang in this case
> 
> Cheers,
> 
> Gilles
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/08/15623.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
