No problem - glad it was resolved. I have silenced the warning.

> On Mar 20, 2015, at 1:44 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> 
> Ralph,
> 
> The ssh message did turn out to be my fault.
> With it resolved I now get the TCP_KEEPALIVE warning 4 times instead of once, 
> but the run proceeds just fine.
> 
> So, this is not a failure - just an undesired warning.
> Sorry to have "cried wolf".
> 
> -Paul
> 
> On Fri, Mar 20, 2015 at 12:01 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> Ralph,
> 
> Yes, it failed.
> Sorry, had meant to include more of the output than I did (see below).
> 
> My Solaris systems moved (physically relocated the disks) yesterday between 
> what *should* have been essentially identical hardware.  At the moment I am 
> looking into the ssh message, though I am sure I should have all the host 
> keys associated with the correct hostnames and IPs already.
> 
> -Paul
> 
> full output:
> 
> $ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c
> [pcp-j-35:01400] [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103] setsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol (99)
> ssh_exchange_identification: Connection closed by remote host
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
> 
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
> 
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
> 
> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to use.
> 
> * compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
> 
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --------------------------------------------------------------------------
> 
> 
> 
> 
> On Fri, Mar 20, 2015 at 7:13 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Hi Paul
> 
> It should have kept running, albeit with that warning - did the program 
> actually fail?
> 
> 
>> On Mar 19, 2015, at 10:05 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>> 
>> Seen earlier today with last night's master tarball:
>> 
>> $ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c
>> [pcp-j-35:01400] [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103] setsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol (99)
>> 
>> -Paul
>> 
>> -- 
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2015/03/17138.php
> 
