[OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
Gentlemen, I am struggling to get MPI working when the hostfile contains different nodes. I get the error below. Any ideas? I can ssh without a password between the two nodes. I am running 1.2.8 MPI on both machines. Any help most appreciated! MPITEST/v8_mpi_test> mpiexec -n 2 --debug-da

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
OK, I installed 1.4.4, rebuilt the exec and guess what .. I now get some weird errors as below: mca: base: component_find: unable to open /usr/local/lib/openmpi/mca_ras_dash_host along with a few other files even though the .so / .la files are all there ! - Original Message - Fro

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
: Re: [OMPI users] MPI orte_init fails on remote nodes You need to clean out the old attempt - that is a stale file Sent from my iPad On Feb 13, 2012, at 7:36 AM, "Richard Bardwell" wrote: OK, I installed 1.4.4, rebuilt the exec and guess what .. I now get some weird

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
My mistake Ralph, should have done a make uninstall instead ! Thanks Richard - Original Message - From: Ralph Castain To: Open MPI Users Sent: Monday, February 13, 2012 3:41 PM Subject: Re: [OMPI users] MPI orte_init fails on remote nodes You need to clean out the old att

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-13 Thread Richard Bardwell
Any ideas ?? Thanks Richard - Original Message - From: "Gustavo Correa" To: "Open MPI Users" Sent: Monday, February 13, 2012 4:22 PM Subject: Re: [OMPI users] MPI orte_init fails on remote nodes On Feb 13, 2012, at 11:02 AM, Richard Bardwell wrote: Ralph I had

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-14 Thread Richard Bardwell
fully uninstall the distro-installed version of Open MPI on all the nodes (e.g., Red Hat may have installed a different version of Open MPI, and that version is being found in your $PATH before your custom-installed version). On Feb 13, 2012, at 12:12 PM, Richard Bardwell wrote: OK, 1.4.4 is happi

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-14 Thread Richard Bardwell
gins, for example. On Feb 14, 2012, at 5:40 AM, Richard Bardwell wrote: Jeff, I wiped out all versions of openmpi on all the nodes including the distro installed version. I reinstalled version 1.4.4 on all nodes. I now get the error that libopen-rte.so.0 cannot be found when running mpiexec acros

Re: [OMPI users] MPI orte_init fails on remote nodes

2012-02-14 Thread Richard Bardwell
nts, or conditional blocks that are only invoked during interactive logins, for example. On Feb 14, 2012, at 5:40 AM, Richard Bardwell wrote: Jeff, I wiped out all versions of openmpi on all the nodes including the distro installed version. I reinstalled version 1.4.4 on all nodes. I no

[OMPI users] MPI_Waitall strange behaviour on remote nodes

2012-02-14 Thread Richard Bardwell
In trying to debug an MPI_Waitall hang on a remote node, I created a simple code to test. If we run the simple code below on 2 nodes on a local machine, we send the number 1 and receive number 1 back. If we run the same code on a local node and a remote node, we send number 1 but get 32767 back.

Re: [OMPI users] Problem running an mpi application on nodes with more than one interface

2012-02-17 Thread Richard Bardwell
I had exactly the same problem. Trying to run mpi between 2 separate machines, with each machine having 2 ethernet ports, causes really weird behaviour on the most basic code. I had to disable one of the ethernet ports on each of the machines and it worked just fine after that. No idea why though !
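
A possible alternative to disabling a port is to tell Open MPI which interface to use for MPI traffic. The following is only a sketch: the interface name `eth0`, the hostfile name, and the program name are placeholders, and nothing in this thread confirms that it fixes this particular case. `btl_tcp_if_include` is the MCA parameter that restricts Open MPI's TCP transport to the listed interface(s).

```shell
# Placeholder values (assumptions, not taken from the thread): adjust to your setup.
IFACE="eth0"          # interface that both machines should use for MPI traffic
HOSTFILE="hostfile"   # hostfile listing the two machines
PROG="./mpi_test"     # MPI program to launch

# Build the launch command. btl_tcp_if_include limits the TCP transport
# to the named interface instead of letting it try every port it finds.
CMD="mpiexec -n 2 --hostfile $HOSTFILE --mca btl_tcp_if_include $IFACE $PROG"

# Print the command for review before running it by hand.
echo "$CMD"
```

If the two ports on a machine sit on the same subnet, that is worth checking first, since restricting the interface is then the more likely thing to help.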

Re: [OMPI users] Problem running an mpi application on nodes with more than one interface

2012-02-17 Thread Richard Bardwell
face Did you have both of the ethernet ports on the same subnet, or were they on different subnets? On Feb 17, 2012, at 5:36 AM, Richard Bardwell wrote: I had exactly the same problem. Trying to run mpi between 2 separate machines, with each machine having 2 ethernet ports, causes really we