Re: [OMPI devel] Master failure of oob_tcp on Solaris

2015-03-20 Thread Ralph Castain
No problem - glad it was resolved. I have silenced the warning.
> On Mar 20, 2015, at 1:44 PM, Paul Hargrove wrote:
>
> Ralph,
>
> The ssh message did turn out to be my fault.
> With it resolved I now get the TCP_KEEPALIVE warning 4 times instead of once,
> but the run proceeds just fine.

Re: [OMPI devel] Master failure of oob_tcp on Solaris

2015-03-20 Thread Paul Hargrove
Ralph,
The ssh message did turn out to be my fault. With it resolved I now get the TCP_KEEPALIVE warning 4 times instead of once, but the run proceeds just fine. So, this is not a failure - just an undesired warning. Sorry to have "cried wolf".
-Paul
On Fri, Mar 20, 2015 at 12:01 PM, Paul Hargr

Re: [OMPI devel] Master failure of oob_tcp on Solaris

2015-03-20 Thread Paul Hargrove
Ralph, Yes, it failed. Sorry, had meant to include more of the output than I did (see below). My Solaris systems moved (physically relocated the disks) yesterday between what *should* have been essentially identical hardware. At the moment I am looking into the ssh message, though I am sure I sh

Re: [OMPI devel] Master failure of oob_tcp on Solaris

2015-03-20 Thread Ralph Castain
Hi Paul
It should have kept running, albeit with that warning - did the program actually fail?
> On Mar 19, 2015, at 10:05 PM, Paul Hargrove wrote:
>
> Seen earlier today with last night's master tarball:
>
> $ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c'
> [p

[OMPI devel] Master failure of oob_tcp on Solaris

2015-03-20 Thread Paul Hargrove
Seen earlier today with last night's master tarball:
$ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c'
[pcp-j-35:01400] [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103] setsockopt(TCP_KEEPALIVE) faile
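For readers following the thread: the warning comes from a setsockopt() call on a TCP keepalive option, and Ralph's reply notes that the run should continue despite it. The following is a minimal sketch of that "warn but keep going" pattern, not the code in oob_tcp_common.c. The helper name set_keepalive_best_effort and its idle_secs parameter are hypothetical, and which keepalive option (if any) a given platform defines is an assumption resolved at compile time.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper (not an Open MPI function): try to enable TCP
 * keepalive on an open socket. Every failure is reported as a warning
 * on stderr and counted; the caller can keep using the socket. */
static int set_keepalive_best_effort(int sd, int idle_secs)
{
    int failures = 0;
    int on = 1;

    assert(sd >= 0);

    /* SO_KEEPALIVE is the portable on/off switch. */
    if (setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        fprintf(stderr, "warning: setsockopt(SO_KEEPALIVE) failed: %s\n",
                strerror(errno));
        return 1; /* tuning the idle time is pointless without it */
    }

    /* The option that sets the idle time has different names on
     * different systems, so only use one the headers actually define. */
#if defined(TCP_KEEPIDLE)
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0) {
        fprintf(stderr, "warning: setsockopt(TCP_KEEPIDLE) failed: %s\n",
                strerror(errno));
        failures++;
    }
#elif defined(TCP_KEEPALIVE)
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPALIVE,
                   &idle_secs, sizeof(idle_secs)) < 0) {
        fprintf(stderr, "warning: setsockopt(TCP_KEEPALIVE) failed: %s\n",
                strerror(errno));
        failures++;
    }
#else
    (void)idle_secs; /* no idle-time option known on this platform */
#endif

    return failures;
}
```

A caller would create a socket with socket(AF_INET, SOCK_STREAM, 0), pass the descriptor to the helper, and continue regardless of the return value, using it only to decide whether to log further detail.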