Something is clearly wrong. Most likely, you are not pointing at the OMPI 
install that you think you are, or you didn't really configure it properly. 
Check the path by running "which mpirun" and ensure you are executing the one 
you expected. If so, then run "ompi_info" to see how it was configured and send 
the output to us.
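
For example, a quick sanity check looks something like this (the 1.8.4 path shown is only a guess at where you installed it):

  which mpirun
  mpirun --version
  ompi_info | grep -i configure
  export PATH=/opt/openmpi-1.8.4/bin:$PATH
  export LD_LIBRARY_PATH=/opt/openmpi-1.8.4/lib:$LD_LIBRARY_PATH

If "which mpirun" points at a different install (e.g., a system-packaged 1.6.x), prepend the intended bin and lib directories as above and re-run your test.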


> On Mar 28, 2015, at 1:36 PM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
> 
> Surprisingly, that is all I get! Nothing else comes after it. This is the 
> same for openmpi-1.6.5.
> 
> 
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: 28 March 2015 20:12
> To: Open MPI Users
> Subject: Re: [OMPI users] Connection problem on Linux cluster
> 
> Did you configure --enable-debug? We aren't seeing any of the debug output, so 
> I suspect not.
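> 
> If debug wasn't enabled, a minimal rebuild sketch looks like this (the install prefix below is only an example; use whatever location you prefer):
> 
>   ./configure --prefix=$HOME/openmpi-1.8.4-debug --enable-debug
>   make -j4 all install
> 
> Afterwards, put that prefix's bin directory first in PATH (and its lib directory in LD_LIBRARY_PATH) on both nodes, then re-run the verbose command.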
> 
> 
>> On Mar 28, 2015, at 12:56 PM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>> 
>> I have done that, and these are the results:
>> 
>> ubuntu@fehg-node-0:~$ mpirun -host fehg-node-7 -mca oob_base_verbose 100 
>> -mca state_base_verbose 10 hostname
>> [fehg-node-0:30034] mca: base: components_open: Looking for oob components
>> [fehg-node-0:30034] mca: base: components_open: opening oob components
>> [fehg-node-0:30034] mca: base: components_open: found loaded component tcp
>> [fehg-node-0:30034] mca: base: components_open: component tcp register 
>> function successful
>> [fehg-node-0:30034] mca: base: components_open: component tcp open function 
>> successful
>> [fehg-node-7:31138] mca: base: components_open: Looking for oob components
>> [fehg-node-7:31138] mca: base: components_open: opening oob components
>> [fehg-node-7:31138] mca: base: components_open: found loaded component tcp
>> [fehg-node-7:31138] mca: base: components_open: component tcp register 
>> function successful
>> [fehg-node-7:31138] mca: base: components_open: component tcp open function 
>> successful
>> 
>> Then it freezes ...
>> 
>> Regards
>> 
>> From: users [users-boun...@open-mpi.org] on behalf of LOTFIFAR F. [foad.lotfi...@durham.ac.uk]
>> Sent: 28 March 2015 18:49
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>> 
>> fehg_node_1 and fehg-node-7 are the same; it is just a typo. 
>> 
>> Correction: VM names are fehg-node-0 and fehg-node-7.
>> 
>> 
>> Regards,
>> 
>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
>> Sent: 28 March 2015 18:23
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>> 
>> Just to be clear: do you have two physical nodes? Or just one physical node 
>> and you are running two VMs on it?
>> 
>>> On Mar 28, 2015, at 10:51 AM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>>> 
>>> I have a floating IP for accessing the nodes from outside the cluster, plus 
>>> internal IP addresses. I tried running the jobs with both of them (both IP 
>>> addresses), but it makes no difference. 
>>> I have just installed openmpi 1.6.5 to see how this version behaves. In that 
>>> case I get nothing at all and have to press Ctrl+C; no output or error is 
>>> shown.
>>> 
>>> 
>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
>>> Sent: 28 March 2015 17:03
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Connection problem on Linux cluster
>>> 
>>> You mentioned running this in a VM - is that IP address correct for getting 
>>> across the VMs?
>>> 
>>> 
>>>> On Mar 28, 2015, at 8:38 AM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I am wondering how I can solve this problem. 
>>>> System Spec:
>>>> 1- Linux cluster with two nodes (master and slave) running Ubuntu 12.04 LTS 
>>>> 32-bit.
>>>> 2- openmpi 1.8.4
>>>> 
>>>> I do a simple test running on fehg_node_0:
>>>> > mpirun -host fehg_node_0,fehg_node_1 hello_world -mca oob_base_verbose 20
>>>> 
>>>> and I get the following error:
>>>> 
>>>> A process or daemon was unable to complete a TCP connection
>>>> to another process:
>>>>   Local host:    fehg-node-0
>>>>   Remote host:   10.104.5.40
>>>> This is usually caused by a firewall on the remote host. Please
>>>> check that any firewall (e.g., iptables) has been disabled and
>>>> try again.
>>>> ------------------------------------------------------------
>>>> --------------------------------------------------------------------------
>>>> ORTE was unable to reliably start one or more daemons.
>>>> This usually is caused by:
>>>> 
>>>> * not finding the required libraries and/or binaries on
>>>>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>>>>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>>>> 
>>>> * lack of authority to execute on one or more specified nodes.
>>>>   Please verify your allocation and authorities.
>>>> 
>>>> * the inability to write startup files into /tmp 
>>>> (--tmpdir/orte_tmpdir_base).
>>>>   Please check with your sys admin to determine the correct location to 
>>>> use.
>>>> 
>>>> *  compilation of the orted with dynamic libraries when static are required
>>>>   (e.g., on Cray). Please check your configure cmd line and consider using
>>>>   one of the contrib/platform definitions for your system type.
>>>> 
>>>> * an inability to create a connection back to mpirun due to a
>>>>   lack of common network interfaces and/or no route found between
>>>>   them. Please check network connectivity (including firewalls
>>>>   and network routing requirements).
>>>> 
>>>> Verbose:
>>>> 1- I have full access to the VMs on the cluster and set everything up myself.
>>>> 2- The firewall and iptables are disabled on both nodes.
>>>> 3- The nodes can ssh to each other with no problem.
>>>> 4- Non-interactive bash calls work fine, i.e., when I run ssh othernode env 
>>>> | grep PATH from both nodes, both PATH and LD_LIBRARY_PATH are set 
>>>> correctly (exact commands below).
>>>> 5- I have checked the archives; a similar problem was reported for Solaris, 
>>>> but I could not find a clue about mine. 
>>>> 6- Configuring with --enable-orterun-prefix-by-default does not make any 
>>>> difference.
>>>> 7- I can see orted running on the other node when I check the processes, but 
>>>> nothing happens after that and the error appears.
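>>>> 
>>>> For example, the checks behind points 2 and 4 were along these lines 
>>>> (hostname as in my setup; the firewall commands assume Ubuntu's 
>>>> iptables/ufw tools):
>>>> 
>>>>   ssh fehg-node-7 env | grep -E 'PATH|LD_LIBRARY_PATH'
>>>>   sudo iptables -L -n
>>>>   sudo ufw status
>>>> 
>>>> Both nodes show the expected OMPI paths and report the firewall inactive.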
>>>> 
>>>> Regards,
>>>> Karos
