What is your PATH / LD_LIBRARY_PATH when you rsh/ssh to other nodes?

ssh othernode which mpirun
ssh othernode env | grep PATH
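
A rough sketch of a related local check, in case it is useful: `which` reports only the first `mpirun` on `PATH`, so a second installation further down the list can go unnoticed. The helper name `find_all_in_path` is hypothetical (it is not part of Open MPI or of this thread); it lists every match in search order.

```shell
# Hypothetical helper: print every executable named "$1" found in a
# PATH-style, colon-separated list "$2" (defaults to $PATH), in search order.
# The function body runs in a subshell so the IFS change does not leak out.
find_all_in_path() (
    name=$1
    search=${2-$PATH}
    IFS=:
    set -f                      # no globbing while splitting the list
    for dir in $search; do
        [ -n "$dir" ] || dir=.  # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)
```

Running `find_all_in_path mpirun` and getting more than one line back would mean several installations are reachable through `PATH`; it does not by itself say which libraries get loaded.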


On Feb 13, 2009, at 5:11 AM, jody wrote:

Well, everything I check seems to confirm that only one version is in use:

[jody@localhost 3D]$ ls -ld /opt/openmp*
lrwxrwxrwx 1 root root   26 2009-02-13 14:09 /opt/openmpi -> /opt/openmpi-1.3.1a0r20534
drwxr-xr-x 7 root root 4096 2009-02-12 22:19 /opt/openmpi-1.3.1a0r20432
drwxr-xr-x 7 root root 4096 2009-02-12 21:58 /opt/openmpi-1.3.1a0r20520
drwxr-xr-x 7 root root 4096 2009-02-13 13:46 /opt/openmpi-1.3.1a0r20534
drwxr-xr-x 7 root root 4096 2009-02-12 22:41 /opt/openmpi-1.4a1r20525
[jody@localhost 3D]$ echo $PATH
/opt/openmpi/bin:/opt/jdk/jdk1.6.0_07/bin:/opt/jdk/jdk1.6.0_07/bin:/opt/jdk/jdk1.6.0_07/bin:/usr/kerberos/bin:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jody/bin:/home/jody/utils
[jody@localhost 3D]$ which mpirun
/opt/openmpi/bin/mpirun
[jody@localhost 3D]$ mpirun --version
mpirun (Open MPI) 1.3.1a0r20534

Report bugs to http://www.open-mpi.org/community/help/
[jody@localhost 3D]$ /opt/openmpi-1.3.1a0r20534/bin/mpirun --version
mpirun (Open MPI) 1.3.1a0r20534

Report bugs to http://www.open-mpi.org/community/help/
[jody@localhost 3D]$

BTW the same strange misbehaviour happens with the other versions.

Jody


On Fri, Feb 13, 2009 at 1:54 PM, jody <jody....@gmail.com> wrote:
Forgot to add:
I have /opt/openmpi/bin in my $PATH.

I tried around some more and found that it
also works without errors if I use
/opt/openmpi/bin/mpirun -np 2 ./sr

I don't understand this, because 'mpirun' alone should be the same thing:
[jody@localhost 3D]$ which mpirun
/opt/openmpi/bin/mpirun
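
One thing that can be checked here, as a sketch only: `which` searches `PATH`, whereas the shell may resolve a bare `mpirun` through an alias, a function, or a remembered location first. The helper names below are hypothetical, and nothing in this thread establishes that this is the cause of the error.

```shell
# Hypothetical helpers, not part of Open MPI.

# Print the first executable named "$1" on $PATH (what `which` would
# typically report). Runs in a subshell so IFS is left untouched outside.
first_on_path() (
    IFS=:
    set -f
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    return 1
)

# Succeed if the shell itself resolves "$1" to something other than
# first_on_path does (an alias, a function, a builtin, or a remembered path).
resolves_differently() {
    shell_view=$(command -v "$1" 2>/dev/null) || shell_view=
    path_view=$(first_on_path "$1") || path_view=
    [ "$shell_view" != "$path_view" ]
}
```

If `resolves_differently mpirun` succeeds, `type mpirun` shows what the shell would actually run, and `hash -r` clears remembered locations.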

Thank You for an explanation

Jody

On Fri, Feb 13, 2009 at 1:39 PM, jody <jody....@gmail.com> wrote:
Yes, it was not doing any sensible work;
it was only intended to show the error message.

I have now downloaded the latest nightly tarball, installed it,
and used your version of the test program. It works,
*if* I use the full path to mpirun:

[jody@localhost 3D]$ /opt/openmpi-1.3.1a0r20534/bin/mpirun -np 2 ./sr

But if I use the name alone, I get the error:

[jody@localhost 3D]$ mpirun -np 2 ./sr
[localhost.localdomain:29285] *** An error occurred in MPI_Sendrecv
[localhost.localdomain:29285] *** on communicator MPI_COMM_WORLD
[localhost.localdomain:29285] *** MPI_ERR_RANK: invalid rank
[localhost.localdomain:29285] *** MPI_ERRORS_ARE_FATAL (goodbye)
[localhost.localdomain:29286] *** An error occurred in MPI_Sendrecv
[localhost.localdomain:29286] *** on communicator MPI_COMM_WORLD
[localhost.localdomain:29286] *** MPI_ERR_RANK: invalid rank
[localhost.localdomain:29286] *** MPI_ERRORS_ARE_FATAL (goodbye)

Interestingly, it seems to be the same version:
[jody@localhost 3D]$ mpirun --version
mpirun (Open MPI) 1.3.1a0r20534

i.e. the version is ok.
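
Note that `mpirun --version` describes the launcher only. The MPI library that `./sr` loads at run time is picked by the dynamic linker, so the two can disagree. A minimal sketch for checking this, assuming glibc-style `ldd` output and using a hypothetical helper name:

```shell
# Hypothetical helper: read `ldd` output on stdin and print the path of
# each libmpi shared library the dynamic linker would load.
# Assumes the usual glibc layout: "name => /path/to/lib (address)".
libmpi_from_ldd() {
    awk '$1 ~ /^libmpi\.so/ && $2 == "=>" {
        # ldd prints "not found" in place of a path for unresolved libraries
        print ($3 == "not" ? "not found" : $3)
    }'
}
```

Usage would be `ldd ./sr | libmpi_from_ldd`; a path outside the installation that `which mpirun` points to would indicate a mismatch.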

I have my Open-MPI versions installed in directories
/opt/openmpi-1.xxx
and create a link
ln -s /opt/openmpi-1.xxx /opt/openmpi
I do it like this so I can easily switch between different versions.

Could the different behaviour of mpirun and
/opt/openmpi-1.3.1a0r20534/bin/mpirun
be caused by this setup?
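
A small sketch for testing the symlink part of this question. The helper names are hypothetical, and it assumes a `readlink` that supports `-f` and an `ln` that supports `-n` (GNU coreutils provide both).

```shell
# Hypothetical helpers for a symlink-based layout like the one described above.

# Succeed if two paths resolve to the same file once all symlinks are followed.
same_file() {
    [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Point the link "$2" at the installation directory "$1", replacing an
# existing link. -n stops ln from descending into the directory that the
# old link points to and creating the new link inside it.
switch_install() {
    ln -sfn "$1" "$2"
}
```

For example, `same_file "$(command -v mpirun)" /opt/openmpi-1.3.1a0r20534/bin/mpirun && echo same` would confirm that both names reach one binary. Separately from the symlink, the mpirun man page documents that invoking mpirun by absolute path is equivalent to passing `--prefix`, which affects the `PATH` and `LD_LIBRARY_PATH` seen by the launched processes; whether that explains the difference here is not established in this thread.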

Thank You
Jody

On Fri, Feb 13, 2009 at 1:18 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
On Feb 12, 2009, at 2:00 PM, jody wrote:

In my application I use MPI_PROC_NULL
as an argument in MPI_Sendrecv to simplify the
program (i.e. no special cases for borders).
With 1.3 it works, but under 1.3.1a0r20520
I get the following error:
[jody@localhost 3D]$ mpirun -np 2 ./sr
[localhost.localdomain:29253] *** An error occurred in MPI_Sendrecv
[localhost.localdomain:29253] *** on communicator MPI_COMM_WORLD
[localhost.localdomain:29253] *** MPI_ERR_RANK: invalid rank
[localhost.localdomain:29253] *** MPI_ERRORS_ARE_FATAL (goodbye)
[localhost.localdomain:29252] *** An error occurred in MPI_Sendrecv
[localhost.localdomain:29252] *** on communicator MPI_COMM_WORLD
[localhost.localdomain:29252] *** MPI_ERR_RANK: invalid rank
[localhost.localdomain:29252] *** MPI_ERRORS_ARE_FATAL (goodbye)

Your program as written should hang, right? You're trying to receive from
MPI_COMM_WORLD (MCW) rank 1 and no process is sending.

I slightly modified your code:

#include <stdio.h>
#include "mpi.h"

int main() {
  int iRank;
  int iSize;
  MPI_Status st;

  MPI_Init(NULL, NULL);
  MPI_Comm_size(MPI_COMM_WORLD, &iSize);
  MPI_Comm_rank(MPI_COMM_WORLD, &iRank);

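  if (1 == iRank) {
      /* Rank 1 sends, so rank 0's receive below has a matching message. */
      MPI_Send(&iSize, 1, MPI_INT, 0, 77, MPI_COMM_WORLD);
  } else if (0 == iRank) {
      /* Send half targets MPI_PROC_NULL (a no-op); receive half is from rank 1. */
      MPI_Sendrecv(&iRank, 1, MPI_INT, MPI_PROC_NULL, 77,
                   &iSize, 1, MPI_INT, 1, 77, MPI_COMM_WORLD, &st);
  }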

  MPI_Finalize();
  return 0;
}

And that works fine for me at the head of the v1.3 branch:

[16:17] svbu-mpi:~/svn/ompi-1.3 % svnversion .
20538

We did have a few bad commits on the v1.3 branch recently; could you try
with a tarball from tonight, perchance?

--
Jeff Squyres
Cisco Systems

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





--
Jeff Squyres
Cisco Systems
