No problem - glad it was resolved. I have silenced the warning.
> On Mar 20, 2015, at 1:44 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> The ssh message did turn out to be my fault.
> With it resolved I now get the TCP_KEEPALIVE warning 4 times instead of once,
> but the run proceeds just fine.
>
> So, this is not a failure - just an undesired warning.
> Sorry to have "cried wolf".
>
> -Paul
>
> On Fri, Mar 20, 2015 at 12:01 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> Ralph,
>
> Yes, it failed.
> Sorry, had meant to include more of the output than I did (see below).
>
> My Solaris systems moved (physically relocated the disks) yesterday between
> what *should* have been essentially identical hardware. At the moment I am
> looking into the ssh message, though I am sure I should have all the host
> keys associated with the correct hostnames and IPs already.
>
> -Paul
>
> full output:
>
> $ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c'
> [pcp-j-35:01400]
> [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103]
> setsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol (99)
> ssh_exchange_identification: Connection closed by remote host
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to use.
>
> * compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --------------------------------------------------------------------------
>
> On Fri, Mar 20, 2015 at 7:13 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Hi Paul
>
> It should have kept running, albeit with that warning - did the program
> actually fail?
>
>> On Mar 19, 2015, at 10:05 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Seen earlier today with last night's master tarball:
>>
>> $ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c'
>> [pcp-j-35:01400]
>> [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103]
>> setsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol (99)
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2015/03/17138.php
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2015/03/17139.php
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2015/03/17143.php