Ralph, You will see from the message I sent a moment ago that -D_REENTRANT on Solaris appears to be the problem. However, I will also try the trunk tarball as you have requested.
-Paul On Mon, Nov 3, 2014 at 8:53 PM, Ralph Castain <rhc.open...@gmail.com> wrote: > Hmmm...Paul, would you be able to try this with the latest trunk tarball? > This looks familiar to me, and I wonder if we are just missing a changeset > from the trunk that fixed the handshake issues we had with failing over > from one transport to another. > > Ralph > > On Nov 3, 2014, at 7:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > Requested output is attached. > > I have a Linux/x86 system with the same network configuration and will > soon be able to determine if the problem is specific to Solaris. > > -Paul > > > On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain <rhc.open...@gmail.com> > wrote: > >> Could you please set -mca oob_base_verbose 20? I'm not sure why the >> connection is failing. >> >> Thanks >> Ralph >> >> On Nov 3, 2014, at 5:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: >> >> Not clear if the following failure is Solaris-specific, but it *IS* a >> regression relative to 1.8.3. >> >> The system has 2 IPV4 interfaces: >> Ethernet on 172.16.0.119/16 >> IPoIB on 172.18.0.119/16 >> >> $ ifconfig bge0 >> bge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 >> index 2 >> inet 172.16.0.119 netmask ffff0000 broadcast 172.16.255.255 >> $ ifconfig pFFFF.ibp0 >> pFFFF.ibp0: >> flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 2044 >> index 3 >> inet 172.18.0.119 netmask ffff0000 broadcast 172.18.255.255 >> >> However, I get a message from mca/oob/tcp about not being able to >> communicate between these two interfaces ON THE SAME NODE: >> >> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun >> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c >> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0). >> ------------------------------------------------------------ >> A process or daemon was unable to complete a TCP connection >> to another process: >> Local host: pcp-j-19 >> Remote host: 172.18.0.119 >> This is usually caused by a firewall on the remote host. Please >> check that any firewall (e.g., iptables) has been disabled and >> try again. >> ------------------------------------------------------------ >> >> Let me know what sort of verbose options I should use to gather any >> additional info you may need. >> >> -Paul >> >> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain <rhc.open...@gmail.com> >> wrote: >> >>> Hi folks >>> >>> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a >>> head start on the testing so we can release by Fri Nov 7th. So please take >>> a little time and test the current tarball: >>> >>> http://www.open-mpi.org/software/ompi/v1.8/ >>> >>> Thanks >>> Ralph >>> >>> >>> _______________________________________________ >>> devel mailing list >>> de...@open-mpi.org >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>> Link to this post: >>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php >>> >> >> >> >> -- >> Paul H. Hargrove phhargr...@lbl.gov >> Future Technologies Group >> Computer and Data Sciences Department Tel: +1-510-495-2352 >> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900 >> _______________________________________________ >> devel mailing list >> de...@open-mpi.org >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >> Link to this post: >> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php >> >> >> >> _______________________________________________ >> devel mailing list >> de...@open-mpi.org >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >> Link to this post: >> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php >> > > > > -- > Paul H. Hargrove phhargr...@lbl.gov > Future Technologies Group > Computer and Data Sciences Department Tel: +1-510-495-2352 > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900 > <oob_base_verbose=20.txt>_______________________________________________ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2014/11/16162.php > > > > _______________________________________________ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2014/11/16163.php > -- Paul H. Hargrove phhargr...@lbl.gov Future Technologies Group Computer and Data Sciences Department Tel: +1-510-495-2352 Lawrence Berkeley National Laboratory Fax: +1-510-486-6900