I only ran this locally.
Here at home I have just one computer.

But indeed, I hadn't set $LD_LIBRARY_PATH to /opt/openmpi/lib.
After I did so, the error no longer occurred with the short call
(plain 'mpirun' without the full path).

It looks like the libraries being picked up at run time came from
  /usr/lib/openmpi/1.2.4-gcc/
Apparently I hadn't properly removed an old Open MPI version which
had been put there by Fedora...
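
For anyone hitting the same problem, the check and the fix can be
sketched in shell roughly as follows. This is only a sketch under
assumptions: prepend_dir is a hypothetical helper name (not part of
Open MPI), /opt/openmpi/lib is the install location used in this thread,
and ./sr is the test binary from this thread.

```shell
#!/bin/sh
# Hypothetical helper: print DIR prepended to a colon-separated search
# path LIST. An empty LIST yields DIR alone, which avoids a trailing
# colon (the dynamic loader treats an empty entry as the current directory).
prepend_dir() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Put the /opt/openmpi libraries ahead of any system-wide Open MPI.
LD_LIBRARY_PATH=$(prepend_dir /opt/openmpi/lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH

# Show which MPI shared libraries the test binary would load.
# Skipped when ./sr is not present in the current directory.
if [ -x ./sr ]; then
    ldd ./sr | grep -i mpi
fi
```

If the ldd output still lists a path under /usr/lib/openmpi/, the old
libraries are presumably still being found first.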

Thanks!

Jody

On Fri, Feb 13, 2009 at 2:39 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> What is your PATH / LD_LIBRARY_PATH when you rsh/ssh to other nodes?
>
> ssh othernode which mpirun
> ssh othernode env | grep PATH
>
>
> On Feb 13, 2009, at 5:11 AM, jody wrote:
>
>> Well, everything I check seems to confirm that only one version is in use:
>>
>> [jody@localhost 3D]$ ls -ld /opt/openmp*
>> lrwxrwxrwx 1 root root   26 2009-02-13 14:09 /opt/openmpi -> /opt/openmpi-1.3.1a0r20534
>> drwxr-xr-x 7 root root 4096 2009-02-12 22:19 /opt/openmpi-1.3.1a0r20432
>> drwxr-xr-x 7 root root 4096 2009-02-12 21:58 /opt/openmpi-1.3.1a0r20520
>> drwxr-xr-x 7 root root 4096 2009-02-13 13:46 /opt/openmpi-1.3.1a0r20534
>> drwxr-xr-x 7 root root 4096 2009-02-12 22:41 /opt/openmpi-1.4a1r20525
>> [jody@localhost 3D]$ echo $PATH
>>
>> /opt/openmpi/bin:/opt/jdk/jdk1.6.0_07/bin:/opt/jdk/jdk1.6.0_07/bin:/opt/jdk/jdk1.6.0_07/bin:/usr/kerberos/bin:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jody/bin:/home/jody/utils
>> [jody@localhost 3D]$ which mpirun
>> /opt/openmpi/bin/mpirun
>> [jody@localhost 3D]$ mpirun --version
>> mpirun (Open MPI) 1.3.1a0r20534
>>
>> Report bugs to http://www.open-mpi.org/community/help/
>> [jody@localhost 3D]$ /opt/openmpi-1.3.1a0r20534/bin/mpirun --version
>> mpirun (Open MPI) 1.3.1a0r20534
>>
>> Report bugs to http://www.open-mpi.org/community/help/
>> [jody@localhost 3D]$
>>
>> BTW, the same strange misbehaviour happens with the other versions.
>>
>> Jody
>>
>>
>> On Fri, Feb 13, 2009 at 1:54 PM, jody <jody....@gmail.com> wrote:
>>>
>>> Forgot to add:
>>> I have /opt/openmpi/bin in my $PATH.
>>>
>>> I experimented some more and found that it
>>> also works without errors if I use
>>> /opt/openmpi/bin/mpirun -np 2 ./sr
>>>
>>> I don't understand this,  because 'mpirun' alone should be the same
>>> thing:
>>> [jody@localhost 3D]$ which mpirun
>>> /opt/openmpi/bin/mpirun
>>>
>>> Thanks in advance for an explanation.
>>>
>>> Jody
>>>
>>> On Fri, Feb 13, 2009 at 1:39 PM, jody <jody....@gmail.com> wrote:
>>>>
>>>> Yes, it was doing no sensible work.
>>>> It was only intended to show the error message.
>>>>
>>>> I have now downloaded the latest nightly tarball and installed it,
>>>> and used your version of the test program. It works,
>>>> but only *if* I use the entire path to mpirun:
>>>>
>>>> [jody@localhost 3D]$  /opt/openmpi-1.3.1a0r20534/bin/mpirun -np 2 ./sr
>>>>
>>>> But if I use the name alone, I get the error:
>>>>
>>>> [jody@localhost 3D]$ mpirun -np 2 ./sr
>>>> [localhost.localdomain:29285] *** An error occurred in MPI_Sendrecv
>>>> [localhost.localdomain:29285] *** on communicator MPI_COMM_WORLD
>>>> [localhost.localdomain:29285] *** MPI_ERR_RANK: invalid rank
>>>> [localhost.localdomain:29285] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>> [localhost.localdomain:29286] *** An error occurred in MPI_Sendrecv
>>>> [localhost.localdomain:29286] *** on communicator MPI_COMM_WORLD
>>>> [localhost.localdomain:29286] *** MPI_ERR_RANK: invalid rank
>>>> [localhost.localdomain:29286] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>
>>>> Interestingly, it seems to be the same version:
>>>> [jody@localhost 3D]$ mpirun --version
>>>> mpirun (Open MPI) 1.3.1a0r20534
>>>>
>>>> i.e. the version is ok.
>>>>
>>>> I have my Open MPI versions installed in directories
>>>> /opt/openmpi-1.xxx
>>>> and create a link
>>>> ln -s /opt/openmpi-1.xxx /opt/openmpi
>>>> I do it like this so I can easily switch between different versions.
>>>>
>>>> Could the different behaviour of mpirun and
>>>> /opt/openmpi-1.3.1a0r20534/bin/mpirun
>>>> be caused by this setup?
>>>>
>>>> Thank You
>>>> Jody
>>>>
>>>> On Fri, Feb 13, 2009 at 1:18 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>>>
>>>>> On Feb 12, 2009, at 2:00 PM, jody wrote:
>>>>>
>>>>>> In my application I use MPI_PROC_NULL
>>>>>> as an argument to MPI_Sendrecv to simplify the
>>>>>> program (i.e. no special cases for borders).
>>>>>> With 1.3 it works, but under 1.3.1a0r20520
>>>>>> I get the following error:
>>>>>> [jody@localhost 3D]$ mpirun -np 2 ./sr
>>>>>> [localhost.localdomain:29253] *** An error occurred in MPI_Sendrecv
>>>>>> [localhost.localdomain:29253] *** on communicator MPI_COMM_WORLD
>>>>>> [localhost.localdomain:29253] *** MPI_ERR_RANK: invalid rank
>>>>>> [localhost.localdomain:29253] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>> [localhost.localdomain:29252] *** An error occurred in MPI_Sendrecv
>>>>>> [localhost.localdomain:29252] *** on communicator MPI_COMM_WORLD
>>>>>> [localhost.localdomain:29252] *** MPI_ERR_RANK: invalid rank
>>>>>> [localhost.localdomain:29252] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>
>>>>> Your program as written should hang, right?  You're trying to
>>>>> receive from MCW rank 1 and no process is sending.
>>>>>
>>>>> I slightly modified your code:
>>>>>
>>>>> #include <stdio.h>
>>>>> #include "mpi.h"
>>>>>
>>>>> int main() {
>>>>>  int iRank;
>>>>>  int iSize;
>>>>>  MPI_Status st;
>>>>>
>>>>>  MPI_Init(NULL, NULL);
>>>>>  MPI_Comm_size(MPI_COMM_WORLD, &iSize);
>>>>>  MPI_Comm_rank(MPI_COMM_WORLD, &iRank);
>>>>>
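>>>>>  if (1 == iRank) {
>>>>>      /* Rank 1 sends the message that rank 0 is waiting for. */
>>>>>      MPI_Send(&iSize, 1, MPI_INT, 0, 77, MPI_COMM_WORLD);
>>>>>  } else if (0 == iRank) {
>>>>>      /* Send half targets MPI_PROC_NULL (a no-op);
>>>>>         receive half takes the message from rank 1. */
>>>>>      MPI_Sendrecv(&iRank, 1, MPI_INT, MPI_PROC_NULL, 77,
>>>>>                   &iSize, 1, MPI_INT, 1, 77, MPI_COMM_WORLD, &st);
>>>>>  }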
>>>>>
>>>>>  MPI_Finalize();
>>>>>  return 0;
>>>>> }
>>>>>
>>>>> And that works fine for me at the head of the v1.3 branch:
>>>>>
>>>>> [16:17] svbu-mpi:~/svn/ompi-1.3 % svnversion .
>>>>> 20538
>>>>>
>>>>> We did have a few bad commits on the v1.3 branch recently; could
>>>>> you try with a tarball from tonight, perchance?
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> Cisco Systems
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>
>>>
>
>
> --
> Jeff Squyres
> Cisco Systems
