Actually, all machines use iptables as their firewall.

I compared the rules on triops and kraken and found that triops had the
line
  REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited
which kraken did not have (otherwise the rule sets were identical).
I removed that line from triops' rules, restarted iptables, and now
communication works in all directions!
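
For anyone hitting the same thing, here is a minimal sketch of the comparison.
It assumes each host's rules were dumped with `iptables -S` (run as root) into
a file; the file names and the helper name rules_only_in_first are made up for
illustration, and the INPUT chain in the comments is an assumption, since I
did not note which chain the rule was in.

```shell
#!/bin/sh
# Print the rules that appear in the first dump but not in the second.
# Both arguments are files holding the output of `iptables -S` from one host.
rules_only_in_first() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$2" "$1"
}

# Hypothetical usage (host names from this thread, file names invented):
#   ssh root@triops iptables -S > triops.rules
#   ssh root@kraken iptables -S > kraken.rules
#   rules_only_in_first triops.rules kraken.rules
#
# If the extra rule turns out to be the REJECT rule in the INPUT chain
# (an assumption), it could be deleted from the running rule set with:
#   iptables -D INPUT -j REJECT --reject-with icmp-host-prohibited
```

Deleting the REJECT rule opens the host up more than strictly necessary. A
narrower option would probably be an ACCEPT rule for the peer machines placed
ahead of it (for example with `iptables -I INPUT -s <peer address> -j ACCEPT`),
but I have not tried that here.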

Thank You
  Jody

On Tue, May 3, 2016 at 7:00 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
wrote:

> Have you disabled firewalls between these machines?
>
> > On May 3, 2016, at 11:26 AM, jody <jody....@gmail.com> wrote:
> >
> > ...my bad!
> >
> > I had set up things so that PATH and LD_LIBRARY_PATH were correct in
> > interactive mode,
> > but they were wrong when ssh was called non-interactively.
> >
> > Now i have a new problem:
> > When i do
> >   mpirun -np 6 --hostfile krakenhosts hostname
> > from triops, sometimes it seems to hang (i.e. no output, doesn't end)
> > and at other times i get the output
> > ----
> > [aim-kraken:24527] [[45056,0],1] tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)
> > --------------------------------------------------------------------------
> > ORTE was unable to reliably start one or more daemons.
> > This usually is caused by:
> > ...
> > --------------------------------------------------------------------------
> > -----
> > Again, i can call mpirun on triops from kraken and all squid_XX without
> > a problem...
> >
> > What could cause this problem?
> >
> > Thank You
> >   Jody
> >
> >
> > On Tue, May 3, 2016 at 2:54 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> > Have you verified that you are running the same version of Open MPI on
> > both servers when launched from non-interactive logins?
> >
> > This kind of error is somewhat typical if you accidentally mixed, for
> > example, Open MPI v1.6.x and v1.10.2 (i.e., v1.10.2 understands the
> > --hnp-topo-sig back end option, but v1.6.x does not).
> >
> >
> > > On May 3, 2016, at 6:35 AM, jody <jody....@gmail.com> wrote:
> > >
> > > Hi
> > > I have installed Open MPI v1.10.2 on two machines today using only
> > > the --prefix option for configure, and then doing 'make all install'.
> > >
> > > On both machines i changed .bashrc to set PATH and LD_LIBRARY_PATH
> > > correctly.
> > > (I checked by running 'mpirun --version' and verifying that the output
> > > does indeed say 1.10.2)
> > >
> > > Password-less ssh is enabled on both machines in both directions.
> > >
> > > When i start mpirun from one machine (kraken) with a hostfile
> > > specifying the other machine ("triops slots=8 max-slots=8"),
> > > it works:
> > > -----
> > > jody@kraken ~ $ mpirun -np 3 --hostfile triopshosts uptime
> > >  12:24:04 up 7 days, 43 min, 17 users,  load average: 0.06, 0.68, 0.65
> > >  12:24:04 up 7 days, 43 min, 17 users,  load average: 0.06, 0.68, 0.65
> > >  12:24:04 up 7 days, 43 min, 17 users,  load average: 0.06, 0.68, 0.65
> > > -----
> > >
> > > But when i start mpirun from triops with a hostfile specifying kraken
> > > ("kraken slots=8 max-slots=8"),
> > > it fails:
> > > -----
> > > jody@triops ~ $ mpirun -np 3 --hostfile krakenhosts hostname
> > > [aim-kraken:21973] Error: unknown option "--hnp-topo-sig"
> > > input in flex scanner failed
> > > --------------------------------------------------------------------------
> > > ORTE was unable to reliably start one or more daemons.
> > > This usually is caused by:
> > >
> > > * not finding the required libraries and/or binaries on
> > >   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> > >   settings, or configure OMPI with --enable-orterun-prefix-by-default
> > >
> > > * lack of authority to execute on one or more specified nodes.
> > >   Please verify your allocation and authorities.
> > >
> > > * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
> > >   Please check with your sys admin to determine the correct location to use.
> > >
> > > *  compilation of the orted with dynamic libraries when static are required
> > >   (e.g., on Cray). Please check your configure cmd line and consider using
> > >   one of the contrib/platform definitions for your system type.
> > >
> > > * an inability to create a connection back to mpirun due to a
> > >   lack of common network interfaces and/or no route found between
> > >   them. Please check network connectivity (including firewalls
> > >   and network routing requirements).
> > > --------------------------------------------------------------------------
> > >
> > > The same error happens when i use '--host kraken'.
> > >
> > > I verified that PATH and LD_LIBRARY_PATH are correctly set on both
> > > machines.
> > > And on both machines /tmp is readable, writeable and executable for
> > > all.
> > > The connection should be okay (i can do a ssh from kraken to triops
> > > and vice versa).
> > >
> > > Any idea what the problem is?
> > >
> > > Thank You
> > >   Jody
> > >
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > > Link to this post: http://www.open-mpi.org/community/lists/users/2016/05/29074.php
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
> >
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
