Hmmm… Paul, would you be able to try this with the latest trunk tarball? This looks familiar to me, and I wonder if we are just missing a changeset from the trunk that fixed the handshake issues we had when failing over from one transport to another.
Ralph

> On Nov 3, 2014, at 7:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> Requested output is attached.
>
> I have a Linux/x86 system with the same network configuration and will soon
> be able to determine if the problem is specific to Solaris.
>
> -Paul
>
>
> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
> Could you please set -mca oob_base_verbose 20? I’m not sure why the
> connection is failing.
>
> Thanks
> Ralph
>
>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Not clear if the following failure is Solaris-specific, but it *IS* a
>> regression relative to 1.8.3.
>>
>> The system has 2 IPV4 interfaces:
>> Ethernet on 172.16.0.119/16
>> IPoIB on 172.18.0.119/16
>>
>> $ ifconfig bge0
>> bge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
>> inet 172.16.0.119 netmask ffff0000 broadcast 172.16.255.255
>> $ ifconfig pFFFF.ibp0
>> pFFFF.ibp0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 2044 index 3
>> inet 172.18.0.119 netmask ffff0000 broadcast 172.18.255.255
>>
>> However, I get a message from mca/oob/tcp about not being able to
>> communicate between these two interfaces ON THE SAME NODE:
>>
>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>> ------------------------------------------------------------
>> A process or daemon was unable to complete a TCP connection
>> to another process:
>>   Local host:    pcp-j-19
>>   Remote host:   172.18.0.119
>> This is usually caused by a firewall on the remote host. Please
>> check that any firewall (e.g., iptables) has been disabled and
>> try again.
>> ------------------------------------------------------------
>>
>> Let me know what sort of verbose options I should use to gather any
>> additional info you may need.
>>
>> -Paul
>>
>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
>> Hi folks
>>
>> I know 1.8.4 isn’t entirely complete just yet, but I’d like to get a head
>> start on the testing so we can release by Fri Nov 7th. So please take a
>> little time and test the current tarball:
>>
>> http://www.open-mpi.org/software/ompi/v1.8/
>>
>> Thanks
>> Ralph
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>>
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php
>
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> <oob_base_verbose=20.txt>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16162.php
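
For anyone reproducing the setup, here is a minimal sketch that double-checks the interface layout quoted in the report. It uses Python's standard `ipaddress` module and only confirms that the two addresses from the `ifconfig` output fall in different /16 networks. The helper name `same_subnet` is made up for illustration, and nothing here is meant to suggest that the subnet split is the actual cause of the failed connection.

```python
import ipaddress

# Addresses and prefix lengths copied from the ifconfig output in the report.
ethernet = ipaddress.ip_interface("172.16.0.119/16")  # bge0
ipoib = ipaddress.ip_interface("172.18.0.119/16")     # pFFFF.ibp0


def same_subnet(a, b):
    """Return True if both interfaces belong to the same IP network.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    return a.network == b.network


# The /16 prefix corresponds to the hex netmask ffff0000 shown by ifconfig.
print(hex(int(ethernet.netmask)))
print(ethernet.network, ipoib.network)
print(same_subnet(ethernet, ipoib))
```

The same check can be run against the Linux/x86 system mentioned in the thread by substituting its addresses.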