Hmmm… Paul, would you be able to try this with the latest trunk tarball? This
looks familiar to me, and I wonder if we are just missing a changeset from the 
trunk that fixed the handshake issues we had with failing over from one 
transport to another.

Ralph

> On Nov 3, 2014, at 7:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> 
> Ralph,
> 
> Requested output is attached.
> 
> I have a Linux/x86 system with the same network configuration and will soon 
> be able to determine if the problem is specific to Solaris.
> 
> -Paul
> 
> 
> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
> Could you please set -mca oob_base_verbose 20? I’m not sure why the 
> connection is failing.
> 
> Thanks
> Ralph
> 
>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>> 
>> Not clear if the following failure is Solaris-specific, but it *IS* a 
>> regression relative to 1.8.3.
>> 
>> The system has 2 IPv4 interfaces:
>>    Ethernet on 172.16.0.119/16
>>    IPoIB on 172.18.0.119/16
>> 
>> $ ifconfig bge0
>> bge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
>>         inet 172.16.0.119 netmask ffff0000 broadcast 172.16.255.255
>> $ ifconfig pFFFF.ibp0
>> pFFFF.ibp0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 2044 index 3
>>         inet 172.18.0.119 netmask ffff0000 broadcast 172.18.255.255
>> 
>> However, I get a message from mca/oob/tcp about not being able to 
>> communicate between these two interfaces ON THE SAME NODE:
>> 
>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>> ------------------------------------------------------------
>> A process or daemon was unable to complete a TCP connection
>> to another process:
>>   Local host:    pcp-j-19
>>   Remote host:   172.18.0.119
>> This is usually caused by a firewall on the remote host. Please
>> check that any firewall (e.g., iptables) has been disabled and
>> try again.
>> ------------------------------------------------------------
>> 
>> Let me know what sort of verbose options I should use to gather any 
>> additional info you may need.
>> 
>> -Paul
>> 
>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
>> Hi folks
>> 
>> I know 1.8.4 isn’t entirely complete just yet, but I’d like to get a head 
>> start on the testing so we can release by Fri Nov 7th. So please take a 
>> little time and test the current tarball:
>> 
>> http://www.open-mpi.org/software/ompi/v1.8/
>> 
>> Thanks
>> Ralph
>> 
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>> 
>> 
>> 
>> -- 
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> 
> 
> 
> 
> -- 
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> <oob_base_verbose=20.txt>
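
An aside on the report quoted above: the two interfaces sit on different /16 networks, which is why the failed connection crosses from one interface to the other even though both belong to the same node. The following is only a minimal sketch to make that concrete, assuming the addresses and the ffff0000 netmask shown in the quoted ifconfig output. The helper name same_subnet is hypothetical, and nothing here reflects how Open MPI itself selects interfaces.

```python
import ipaddress

# Interface addresses copied from the ifconfig output quoted in the report.
# The netmask ffff0000 corresponds to a /16 prefix.
ethernet = ipaddress.ip_interface("172.16.0.119/16")
ipoib = ipaddress.ip_interface("172.18.0.119/16")


def same_subnet(a, b):
    """Hypothetical helper: True when both interfaces share one IP network."""
    return a.network == b.network


print("Ethernet network:", ethernet.network)
print("IPoIB network:   ", ipoib.network)
print("Same subnet:     ", same_subnet(ethernet, ipoib))
```

Since the two networks differ, any connection from the Ethernet address to the IPoIB address has to go through whatever multi-interface handling the TCP layer performs.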
