Surprisingly, that is all I get! Nothing else comes after it. It is the 
same with openmpi-1.6.5.


________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: 28 March 2015 20:12
To: Open MPI Users
Subject: Re: [OMPI users] Connection problem on Linux cluster

Did you configure with --enable-debug? We aren’t seeing any of the debug output, so I 
suspect not.
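
One way to check, as a rough sketch only: `ompi_info` prints an "Internal debug support" line, and the Python below just looks for it. The function name `has_debug_support` is made up for illustration, and the label text is assumed to match what your installed `ompi_info` prints.

```python
import shutil
import subprocess


def has_debug_support(ompi_info_text: str) -> bool:
    """Return True if the given ompi_info output reports internal debug support."""
    for line in ompi_info_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumed label; compare case-insensitively and ignore padding.
        if sep and key.strip().lower() == "internal debug support":
            return value.strip().lower() == "yes"
    # Label not found: treat as "no debug support".
    return False


if __name__ == "__main__":
    exe = shutil.which("ompi_info")
    if exe is None:
        print("ompi_info not found on PATH")
    else:
        out = subprocess.run([exe], capture_output=True, text=True).stdout
        print("debug build:", has_debug_support(out))
```

If this prints `debug build: False`, the install was most likely built without --enable-debug, which would explain the missing debug output.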


On Mar 28, 2015, at 12:56 PM, LOTFIFAR F. 
<foad.lotfi...@durham.ac.uk> wrote:

I have done that, and these are the results:

ubuntu@fehg-node-0:~$ mpirun -host fehg-node-7 -mca oob_base_verbose 100 -mca state_base_verbose 10 hostname
[fehg-node-0:30034] mca: base: components_open: Looking for oob components
[fehg-node-0:30034] mca: base: components_open: opening oob components
[fehg-node-0:30034] mca: base: components_open: found loaded component tcp
[fehg-node-0:30034] mca: base: components_open: component tcp register function successful
[fehg-node-0:30034] mca: base: components_open: component tcp open function successful
[fehg-node-7:31138] mca: base: components_open: Looking for oob components
[fehg-node-7:31138] mca: base: components_open: opening oob components
[fehg-node-7:31138] mca: base: components_open: found loaded component tcp
[fehg-node-7:31138] mca: base: components_open: component tcp register function successful
[fehg-node-7:31138] mca: base: components_open: component tcp open function successful

Then it freezes ...

Regards

________________________________
From: users [users-boun...@open-mpi.org] on behalf of LOTFIFAR F. 
[foad.lotfi...@durham.ac.uk]
Sent: 28 March 2015 18:49
To: Open MPI Users
Subject: Re: [OMPI users] Connection problem on Linux cluster

fehg_node_1 and fehg-node-7 are the same node; it was just a typo.

Correction: VM names are fehg-node-0 and fehg-node-7.


Regards,

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: 28 March 2015 18:23
To: Open MPI Users
Subject: Re: [OMPI users] Connection problem on Linux cluster

Just to be clear: do you have two physical nodes? Or just one physical node and 
you are running two VMs on it?

On Mar 28, 2015, at 10:51 AM, LOTFIFAR F. 
<foad.lotfi...@durham.ac.uk> wrote:

I have a floating IP for accessing the nodes from outside the cluster, plus 
internal IP addresses. I tried running the jobs with both kinds of address, but 
it makes no difference.
I have also just installed openmpi 1.6.5 to see how that version behaves. In that 
case I get nothing at all and have to press Ctrl+C; no output or error is shown.


________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: 28 March 2015 17:03
To: Open MPI Users
Subject: Re: [OMPI users] Connection problem on Linux cluster

You mentioned running this in a VM - is that IP address correct for getting 
across the VMs?


On Mar 28, 2015, at 8:38 AM, LOTFIFAR F. 
<foad.lotfi...@durham.ac.uk> wrote:

Hi,

I am wondering how I can solve this problem.
System Spec:
1- Linux cluster with two nodes (master and slave) with Ubuntu 12.04 LTS 32bit.
2- openmpi 1.8.4

I run a simple test on fehg_node_0:
> mpirun -host fehg_node_0,fehg_node_1 hello_world -mca oob_base_verbose 20

and I get the following error:

A process or daemon was unable to complete a TCP connection
to another process:
  Local host:    fehg-node-0
  Remote host:   10.104.5.40
This is usually caused by a firewall on the remote host. Please
check that any firewall (e.g., iptables) has been disabled and
try again.
------------------------------------------------------------
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.

*  compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).

Details:
1- I have full access to the VMs on the cluster and set up everything myself.
2- The firewall and iptables are disabled on all nodes.
3- The nodes can ssh to each other with no problem.
4- Non-interactive bash calls work fine, i.e. when I run ssh othernode env | 
grep PATH from both nodes, both PATH and LD_LIBRARY_PATH are set correctly.
5- I have checked earlier posts; a similar problem was reported for Solaris, but I 
could not find a clue about mine.
6- Using --enable-orterun-prefix-by-default makes no difference.
7- I can see orte running on the other node when I check its processes, but nothing 
happens after that and then the error appears.
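
Points 2 and 3 only show that ssh (port 22) gets through, whereas the error above is about a separate TCP connection between the Open MPI processes. A minimal sketch for checking plain TCP reachability between the two nodes on some other port, assuming Python is available on both; `can_connect`, `listen_once` and port 5000 are hypothetical names and values chosen for illustration, not part of Open MPI:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def listen_once(port: int, host: str = "0.0.0.0") -> None:
    """Accept a single TCP connection on (host, port), report the peer, then exit."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, peer = srv.accept()  # blocks until a client connects
        conn.close()
        print("connection from", peer[0])


# Hypothetical usage (port 5000 is an arbitrary example):
#   on fehg-node-7:  listen_once(5000)
#   on fehg-node-0:  print(can_connect("fehg-node-7", 5000))
```

Running the listener on one node and the connect check on the other, then swapping roles, tests both directions. A False result while ssh still works would suggest port filtering or routing between the VMs rather than a problem in Open MPI itself.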

Regards,
Karos
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/03/26555.php

