Re: [OMPI devel] trunk hang when nodes have similar but private network
The trunk is [almost] right. It has nice error handling, and a bunch of other
features. However, part of this bug report is troubling. We might want to
check why it doesn't exhaust all possible addresses before giving up on an
endpoint.

  George.

PS: I'm not saying that we should back-port this to the 1.8 ...

On Wed, Aug 13, 2014 at 3:33 PM, Jeff Squyres (jsquyres) wrote:

> On Aug 13, 2014, at 12:52 PM, George Bosilca wrote:
>
> > There are many differences between the trunk and 1.8 regarding the TCP
> > BTL. The major one I remember is that the TCP BTL in the trunk reports
> > errors to the upper level via the callbacks attached to fragments, while
> > the 1.8 TCP BTL doesn't.
> >
> > So, I guess that once a connection to a particular endpoint fails, the
> > trunk gets the errors reported via the callback and then takes some
> > drastic measure. In the 1.8 we might fall back and try another IP
> > address before giving up.
>
> Does that have any effect on performance?
>
> I.e., should we bring this change to v1.8?
>
> Or, put simply: which way is Right?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15638.php
Re: [OMPI devel] trunk hang when nodes have similar but private network
Paul:

I think this is a slippery slope. As I understand it, these private/on-host
IP addresses are generated somewhat randomly (e.g., for on-host VM networking
-- I don't know if the IPs for Phi on-host networking are pseudo-random or
[effectively] fixed). So you might end up in a situation like this:

server A: has br0 on-host IP address 10.0.0.23/8    ***same as server C
server B: has br0 on-host IP address 10.0.0.25/8
server C: has br0 on-host IP address 10.0.0.23/8    ***same as server A
server D: has br0 on-host IP address 10.0.0.107/8

In this case, servers A and C will detect that they have the same IP. "Ah
ha!" they say. "I'll just not use br0, because clearly this is erroneous."
But how will servers B and D know this? You'll likely get the same "hang"
behavior that we currently have, because B may try to send to A on
10.0.0.23/8. Hence, the additional logic may not actually solve the problem.

I'm thinking that this is a human-configuration issue -- there may not be a
good way to detect this automatically.

...unless there's a bit in Linux interfaces that says "this is an on-host
network". Does that exist? Because that would be a better way to disqualify
Linux IP interfaces.

On Aug 13, 2014, at 1:57 PM, Paul Hargrove wrote:

> I think that in this case one *could* add logic that would disqualify the
> subnet because every compute node in the job has the SAME address. In
> fact, any subnet on which two or more compute nodes have the same address
> must be suspect. If this logic were introduced, the 127.0.0.1 loopback
> address wouldn't need to be a special case.
>
> This is just an observation, not a feature request (at least not on my part).
>
> -Paul
>
> On Wed, Aug 13, 2014 at 7:55 AM, Jeff Squyres (jsquyres) wrote:
>
> > I think this is expected behavior.
> >
> > If you have networks that you need Open MPI to ignore (e.g., a private
> > network that *looks* reachable between multiple servers -- because the
> > interfaces are on the same subnet -- but actually *isn't*), then the
> > include/exclude mechanism is the right way to exclude them.
> >
> > That being said, I'm not sure why the behavior is different between
> > trunk and v1.8.
> >
> > On Aug 13, 2014, at 1:41 AM, Gilles Gouaillardet
> > <gilles.gouaillar...@iferc.org> wrote:
> >
> > > Folks,
> > >
> > > I noticed mpirun (trunk) hangs when running any mpi program on two nodes
> > > *and* each node has a private network with the same ip
> > > (in my case, each node has a private network to a MIC)
> > >
> > > in order to reproduce the problem, you can simply run (as root) on the
> > > two compute nodes
> > > brctl addbr br0
> > > ifconfig br0 192.168.255.1 netmask 255.255.255.0
> > >
> > > mpirun will hang
> > >
> > > a workaround is to add --mca btl_tcp_if_include eth0
> > >
> > > v1.8 does not hang in this case
> > >
> > > Cheers,
> > >
> > > Gilles
> > > _______________________________________________
> > > devel mailing list
> > > de...@open-mpi.org
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > > Link to this post:
> > > http://www.open-mpi.org/community/lists/devel/2014/08/15623.php
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2014/08/15631.php
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15636.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
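[Editor's note] Jeff's question about a per-interface "bit" has at least a
partial answer in sysfs: Linux exposes software bridges under
/sys/class/net/<if>/bridge, so an interface like the br0 in the reproducer
can be recognized locally as a bridge. This is only an illustrative sketch
(the optional sysfs-root parameter exists purely so the check can be
exercised against a fake tree), and it does not cover every kind of on-host
network (e.g., the Phi's on-host interface):

```shell
# Return success (0) if interface $1 is a Linux software bridge.
# $2 (optional) overrides the sysfs root (default /sys/class/net);
# bridge devices expose a "bridge" subdirectory there.
is_bridge() {
    [ -d "${2:-/sys/class/net}/$1/bridge" ]
}
```

Note that this only identifies bridges created on the local host; it says
nothing about whether a peer address on the same subnet is actually
reachable, which is the larger point above.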
Re: [OMPI devel] trunk hang when nodes have similar but private network
On Aug 13, 2014, at 12:52 PM, George Bosilca wrote:

> There are many differences between the trunk and 1.8 regarding the TCP BTL.
> The major one I remember is that the TCP BTL in the trunk reports errors
> to the upper level via the callbacks attached to fragments, while the 1.8
> TCP BTL doesn't.
>
> So, I guess that once a connection to a particular endpoint fails, the
> trunk gets the errors reported via the callback and then takes some drastic
> measure. In the 1.8 we might fall back and try another IP address before
> giving up.

Does that have any effect on performance?

I.e., should we bring this change to v1.8?

Or, put simply: which way is Right?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] trunk hang when nodes have similar but private network
I think that in this case one *could* add logic that would disqualify the
subnet because every compute node in the job has the SAME address. In fact,
any subnet on which two or more compute nodes have the same address must be
suspect. If this logic were introduced, the 127.0.0.1 loopback address
wouldn't need to be a special case.

This is just an observation, not a feature request (at least not on my part).

-Paul

On Wed, Aug 13, 2014 at 7:55 AM, Jeff Squyres (jsquyres) wrote:

> I think this is expected behavior.
>
> If you have networks that you need Open MPI to ignore (e.g., a private
> network that *looks* reachable between multiple servers -- because the
> interfaces are on the same subnet -- but actually *isn't*), then the
> include/exclude mechanism is the right way to exclude them.
>
> That being said, I'm not sure why the behavior is different between trunk
> and v1.8.
>
> On Aug 13, 2014, at 1:41 AM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wrote:
>
> > Folks,
> >
> > I noticed mpirun (trunk) hangs when running any mpi program on two nodes
> > *and* each node has a private network with the same ip
> > (in my case, each node has a private network to a MIC)
> >
> > in order to reproduce the problem, you can simply run (as root) on the
> > two compute nodes
> > brctl addbr br0
> > ifconfig br0 192.168.255.1 netmask 255.255.255.0
> >
> > mpirun will hang
> >
> > a workaround is to add --mca btl_tcp_if_include eth0
> >
> > v1.8 does not hang in this case
> >
> > Cheers,
> >
> > Gilles
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2014/08/15623.php
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15631.php

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
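[Editor's note] Paul's suggested check is easy to sketch: once the nodes have
exchanged their interface addresses, any address claimed by more than one
node marks that subnet as suspect. This is a hypothetical illustration of
the bookkeeping only, not the actual Open MPI address-exchange code:

```shell
# Hypothetical sketch of the proposed check. stdin carries one
# "<node> <address>" pair per line (as gathered from all nodes);
# print every address that is claimed by more than one node,
# followed by the nodes claiming it.
find_duplicate_addrs() {
    awk '{ count[$2]++; nodes[$2] = nodes[$2] " " $1 }
         END { for (a in count) if (count[a] > 1) print a ":" nodes[a] }'
}
```

Feeding it the four-server scenario upthread flags 10.0.0.23 as claimed by
both A and C. As Jeff points out, B and D learn nothing from their *own*
addresses, although they could apply the same check to the addresses their
peers advertise.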
Re: [OMPI devel] trunk hang when nodes have similar but private network
There are many differences between the trunk and 1.8 regarding the TCP BTL.
The major one I remember is that the TCP BTL in the trunk reports errors to
the upper level via the callbacks attached to fragments, while the 1.8 TCP
BTL doesn't.

So, I guess that once a connection to a particular endpoint fails, the trunk
gets the errors reported via the callback and then takes some drastic
measure. In the 1.8 we might fall back and try another IP address before
giving up.

  George.

On Wed, Aug 13, 2014 at 10:55 AM, Jeff Squyres (jsquyres)
<jsquy...@cisco.com> wrote:

> I think this is expected behavior.
>
> If you have networks that you need Open MPI to ignore (e.g., a private
> network that *looks* reachable between multiple servers -- because the
> interfaces are on the same subnet -- but actually *isn't*), then the
> include/exclude mechanism is the right way to exclude them.
>
> That being said, I'm not sure why the behavior is different between trunk
> and v1.8.
>
> On Aug 13, 2014, at 1:41 AM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wrote:
>
> > Folks,
> >
> > I noticed mpirun (trunk) hangs when running any mpi program on two nodes
> > *and* each node has a private network with the same ip
> > (in my case, each node has a private network to a MIC)
> >
> > in order to reproduce the problem, you can simply run (as root) on the
> > two compute nodes
> > brctl addbr br0
> > ifconfig br0 192.168.255.1 netmask 255.255.255.0
> >
> > mpirun will hang
> >
> > a workaround is to add --mca btl_tcp_if_include eth0
> >
> > v1.8 does not hang in this case
> >
> > Cheers,
> >
> > Gilles
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2014/08/15623.php
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15631.php
Re: [OMPI devel] trunk hang when nodes have similar but private network
I think this is expected behavior.

If you have networks that you need Open MPI to ignore (e.g., a private
network that *looks* reachable between multiple servers -- because the
interfaces are on the same subnet -- but actually *isn't*), then the
include/exclude mechanism is the right way to exclude them.

That being said, I'm not sure why the behavior is different between trunk
and v1.8.

On Aug 13, 2014, at 1:41 AM, Gilles Gouaillardet wrote:

> Folks,
>
> I noticed mpirun (trunk) hangs when running any mpi program on two nodes
> *and* each node has a private network with the same ip
> (in my case, each node has a private network to a MIC)
>
> in order to reproduce the problem, you can simply run (as root) on the
> two compute nodes
> brctl addbr br0
> ifconfig br0 192.168.255.1 netmask 255.255.255.0
>
> mpirun will hang
>
> a workaround is to add --mca btl_tcp_if_include eth0
>
> v1.8 does not hang in this case
>
> Cheers,
>
> Gilles
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15623.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
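[Editor's note] The include/exclude mechanism referred to here is driven by
the btl_tcp_if_include / btl_tcp_if_exclude MCA parameters, which take
comma-separated interface names (or CIDR subnets in recent versions) and are
mutually exclusive. Hostnames, interface names, and the program name below
are illustrative only:

```shell
# Name only the interface(s) that genuinely connect the nodes...
mpirun --mca btl_tcp_if_include eth0 -np 2 -host node1,node2 ./a.out

# ...or exclude the offending per-node bridge instead. When overriding
# the exclude list, keep the loopback interface in it, since the
# default exclusion of loopback is replaced by your setting.
mpirun --mca btl_tcp_if_exclude lo,br0 -np 2 -host node1,node2 ./a.out
```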
Re: [OMPI devel] trunk hang when nodes have similar but private network
Afraid I can't get to this until next week, but will look at it then.

On Tue, Aug 12, 2014 at 10:41 PM, Gilles Gouaillardet
<gilles.gouaillar...@iferc.org> wrote:

> Folks,
>
> I noticed mpirun (trunk) hangs when running any mpi program on two nodes
> *and* each node has a private network with the same ip
> (in my case, each node has a private network to a MIC)
>
> in order to reproduce the problem, you can simply run (as root) on the
> two compute nodes
> brctl addbr br0
> ifconfig br0 192.168.255.1 netmask 255.255.255.0
>
> mpirun will hang
>
> a workaround is to add --mca btl_tcp_if_include eth0
>
> v1.8 does not hang in this case
>
> Cheers,
>
> Gilles
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15623.php
[OMPI devel] trunk hang when nodes have similar but private network
Folks,

I noticed mpirun (trunk) hangs when running any mpi program on two nodes
*and* each node has a private network with the same ip
(in my case, each node has a private network to a MIC)

in order to reproduce the problem, you can simply run (as root) on the
two compute nodes

brctl addbr br0
ifconfig br0 192.168.255.1 netmask 255.255.255.0

mpirun will hang

a workaround is to add --mca btl_tcp_if_include eth0

v1.8 does not hang in this case

Cheers,

Gilles