[OMPI users] Segmentation faults

2011-03-08 Thread arep isa
Hi,
I need to use Open MPI to distribute a 2-D array read from a PGM file among
10 worker computers, manipulate each value of the array to produce a
negative image (255 - i), and then write the output back. I'm using
mpi_scatterv and mpi_gatherv to distribute the data. After compiling the
program, running it gives segmentation faults, and I don't know whether the
problem is in my code or in the compiler. I integrated the code that
reads/writes PGM from pgm_RW_1.c with the MPI code in exmpi_2.c.
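
For reference, the scatterv/gatherv pattern described above looks roughly
like this (a minimal sketch, not the attached code; note that every rank
must allocate its own receive chunk before MPI_Scatterv -- an unallocated
receive buffer is a common cause of exactly this kind of segfault):

#include <mpi.h>
#include <stdlib.h>

/* Sketch: negate a greyscale image of npixels bytes; 'image' need only
   be valid on rank 0.  PGM reading/writing is omitted. */
void negate_image(unsigned char *image, int npixels)
{
    int rank, size, r, i;
    int *counts, *displs;
    unsigned char *chunk;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    counts = malloc(size * sizeof(int));
    displs = malloc(size * sizeof(int));
    for (r = 0; r < size; r++) {
        counts[r] = npixels / size + (r < npixels % size ? 1 : 0);
        displs[r] = (r == 0) ? 0 : displs[r - 1] + counts[r - 1];
    }

    chunk = malloc(counts[rank]);   /* every rank needs its own buffer */
    MPI_Scatterv(image, counts, displs, MPI_UNSIGNED_CHAR,
                 chunk, counts[rank], MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    for (i = 0; i < counts[rank]; i++)
        chunk[i] = 255 - chunk[i];  /* negative image */

    MPI_Gatherv(chunk, counts[rank], MPI_UNSIGNED_CHAR,
                image, counts, displs, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    free(chunk); free(counts); free(displs);
}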

--I installed Open MPI version 1.4.1-2 via the Synaptic Package Manager on
Ubuntu 10.04.

--I compiled with:
   mpicc -o exmpi_2 exmpi_2.c
--I ran it for testing (and got the segmentation faults):
   mpirun -np 10 ./exmpi_2 2.pgm out.pgm
--Then I ran it with a hostfile:
   mpirun -np 10 --hostfile .mpi_hostfile ./exmpi_2 2.pgm out.pgm


Here is the error:

arep@ubuntu:~/Desktop/fyp$ mpirun -np 10 ./exmpi_2 2.pgm out.pgm
[ubuntu:02948] *** Process received signal ***
[ubuntu:02948] Signal: Segmentation fault (11)
[ubuntu:02948] Signal code: Address not mapped (1)
[ubuntu:02948] Failing at address: (nil)
[ubuntu:02948] [ 0] [0x792410]
[ubuntu:02948] [ 1] ./exmpi_2(main+0x1f6) [0x8048d2a]
[ubuntu:02948] [ 2]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x126bd6]
[ubuntu:02948] [ 3] ./exmpi_2() [0x8048aa1]
[ubuntu:02948] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 2948 on node ubuntu exited
on signal 11 (Segmentation fault).
--


Here is the input 2.pgm image:
http://orion.math.iastate.edu/burkardt/data/pgm/balloons.pgm

Thanks for your help.


pgmmpi.tar.gz
Description: GNU Zip compressed data


[OMPI users] Two Instances of Same Process Rather Than Two Separate Processes

2011-03-08 Thread Clark Britan
I just installed OpenMPI on my 64-bit Linux Ubuntu 10.04 LTS computer. I
downloaded the most recent version of OpenMPI and ran the configure and make
commands.

I then tried to run CFD software called FDS using 2 of the 12 available
processors (a single node) as a test. I split my computational domain into
two meshes, as explained in the FDS manual, and would like to run each mesh
on a separate core.

When I run the command mpirun -np 2 fds5_mpi_linux_64 room_fire.fds I get
the following error:

Process 0 of 0 is running on comp1
Process 0 of 0 is running on comp1
Mesh 1 is assigned to Process 0
Error: MPI_PROCESS greater than total number of processes

Why are two instances of the same process run instead of two separate
processes? What I expect to see after running the above command is:

Process 0 of 1 is running on comp1
Process 1 of 1 is running on comp1
Mesh 1 is assigned to Process 0
Mesh 2 is assigned to Process 1
...

Any idea what is going on? Thanks for the help.

Kind Regards,

Clark



Re: [OMPI users] Number of processes and spawn

2011-03-08 Thread Federico Golfrè Andreasi
Hi Ralph,

I've done some more tests; I hope this helps.



*Using OpenMPI-1.5*

- The program works correctly, doing multiple spawns, up to 128 cpus.
- When spawning on more than 128 cpus, it hangs during the spawn.
  I've discovered that just before the spawn, all the processes lying on
one node go down.
  I've tried to eliminate those nodes from the hostfile, but the same
behaviour then appears on other nodes.

I've attached the output log files.



*Using OpenMPI-1.7a1r24472*

- The program works correctly with more than 128 cpus.
- Sometimes (not with the same number of processes), after the program ends
(it prints THE SLAVE END), the orted daemon is not released.
  None of the master-slave programs show up in top, but I can find
an mpiexec process on the launching node and 1 orted process on every
compute node.
- Sometimes (not with the same number of processes), during the spawn it
prints a warning message of ORTE_ERROR_LOG (I've attached this file as well).
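
(For reference, the master side of the spawn test is essentially the
following sketch; this is an assumed reconstruction, not the attached
program, and the slave binary name ./slave is hypothetical.)

#include <mpi.h>

int main(int argc, char **argv)
{
    int size;
    MPI_Comm child;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* spawn a slave job of the same size as the master job */
    MPI_Comm_spawn("./slave", MPI_ARGV_NULL, size, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &child, MPI_ERRCODES_IGNORE);

    MPI_Comm_disconnect(&child);   /* detach once the slaves finish */
    MPI_Finalize();
    return 0;
}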



Let me know if I can run any more tests that might help,
or if I should check some environment settings or hardware.

Thank you,
Federico.


On 7 March 2011 15:24, Ralph Castain wrote:

>
> On Mar 7, 2011, at 3:24 AM, Federico Golfrè Andreasi wrote:
>
> Hi Ralph,
>
> thank you very much for the detailed response.
>
> I have to apologize, I was not clear: I would like to use the
> MPI_Comm_spawn_multiple function.
>
>
> Shouldn't matter - it's the same code path.
>
> (I've attached the example program I use).
>
>
> I'm rebuilding for C++ as I don't typically use that language - will report
> back later.
>
>
> In any case I tried your test program, just compiling it with:
> /home/fandreasi/openmpi-1.7/bin/mpicc loop_spawn.c -o loop_spawn
> /home/fandreasi/openmpi-1.7/bin/mpicc loop_child.c -o loop_child
> and executing it on a single machine with
> /home/fandreasi/openmpi-1.7/bin/mpiexec ./loop_spawn ./loop_child
>
>
> I should have been clearer - this is not the correct way to run the
> program. The correct way is:
>
> mpiexec -n 1 ./loop_spawn
>
> loop_child is just the executable being comm_spawn'd.
>
> but it hangs at different loop iterations after printing:
> "Child 26833: exiting"
> and looking at top, both processes (loop_spawn and loop_child) are
> still alive.
>
> I'm starting to think that some environment setting of mine is not correct,
> or that I need to compile OpenMPI with some options.
> I compiled it just passing the --prefix option to ./configure.
> Do I need to do anything else?
>
>
> No, that should work.
>
>
> I have a Linux CentOS 4, 64-bit machine,
> with gcc 3.4.
>
> I think that this is my main problem now.
>
>
>
> Just to answer the other (minor) topics:
> - Regarding the version mismatch: I use a Linux cluster where the /home/
> directory is shared among the compute nodes,
> and I've edited my .bashrc and .bash_profile to export the correct
> LD_LIBRARY_PATH.
> - Thank you for the useful trick about svn.
>
>
> No idea, then - all that error says is that the receiving code and the
> sending code are mismatched.
>
>
>
> Thank you very much !!!
> Federico.
>
>
>
>
>
>
> On 5 March 2011 19:05, Ralph Castain wrote:
>
>> Hi Federico
>>
>> I tested the trunk today and it works fine for me - I let it spin for 1000
>> cycles without issue. My test program is essentially identical to what you
>> describe - you can see it in the orte/test/mpi directory. The "master" is
>> loop_spawn.c, and the "slave" is loop_child.c. I only tested it on a single
>> machine, though - will have to test multi-machine later. You might see if
>> that makes a difference.
>>
>> The error you report in your attachment is a classic symptom of mismatched
>> versions. Remember, we don't forward your LD_LIBRARY_PATH, so it has to be
>> correct on your remote machine.
>>
>> As for r22794 - we don't keep anything that old on our web site. If you
>> want to build it, the best way to get the code is to do a subversion
>> checkout of the developer's trunk at that revision level:
>>
>> svn co -r 22794 http://svn.open-mpi.org/svn/ompi/trunk
>>
>> Remember to run autogen before configure.
>>
>>
>> On Mar 4, 2011, at 4:43 AM, Federico Golfrè Andreasi wrote:
>>
>>
>> Hi Ralph,
>>
>> I'm getting stuck with spawning stuff,
>>
>> I've downloaded the trunk snapshot of 1 March (
>> openmpi-1.7a1r24472.tar.bz2),
>> and I'm testing with a small program that does the following:
>>  - the master program starts and each rank prints its hostname
>>  - the master program spawns a slave program of the same size
>>  - each rank of the slave (spawned) program prints its hostname
>>  - end
>> It is not always able to complete the program run; I see two different
>> behaviours:
>>  1. not all the slaves print their hostname and the program ends suddenly
>>  2. both programs end correctly but the orted daemon is still alive and I
>> need to press ctrl-c to exit
>>
>>
>> I've tried to recompile my test program with a previous snapshot
>> (openmpi-1.7a1r22794.tar.bz2)
>> where I have only 

Re: [OMPI users] Two Instances of Same Process Rather Than Two Separate Processes

2011-03-08 Thread Jeff Squyres (jsquyres)
This usually indicates a mismatch of MPI installations - e.g., you compiled
against one MPI installation but then accidentally used the mpirun from a
different MPI installation.
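
One quick sanity check (generic commands, not specific to FDS) is to compare
the launcher found on your PATH against the MPI libraries the executable is
linked to:

   which mpirun
   ldd fds5_mpi_linux_64 | grep -i mpi

If the two point at different installations, that is the mismatch.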

Sent from my phone. No type good. 




Re: [OMPI users] Two Instances of Same Process Rather Than Two Separate Processes

2011-03-08 Thread Gus Correa


Hi Clark

Any chance that MPI_PROCESS was not properly set in your FDS parameter
file?

I am not familiar with the FDS software, but it looks like MPI_PROCESS is
part of the FDS setup, and the error message seems to complain
about a mismatch w.r.t. the number of processes (-np 2).
Maybe it takes a default value.

Also, if you just want to check your OpenMPI functionality, download
the OpenMPI source code, then compile (with mpicc) and run (with mpirun)
the hello_c.c, connectivity_c.c, and ring_c.c programs in the 'examples'
directory.  This will at least tell you whether the problem is in OpenMPI
or in FDS.

My two cents,
Gus Correa


[OMPI users] multi-threaded programming

2011-03-08 Thread Eugene Loh
Let's say you have multi-threaded MPI processes, you request 
MPI_THREAD_MULTIPLE and get MPI_THREAD_MULTIPLE, and you use the 
self,sm,tcp BTLs (which have some degree of threading support).  Is it 
okay to have an [MPI_Isend|MPI_Irecv] on one thread be completed by an 
MPI_Wait on another thread?  I'm assuming some sort of synchronization 
and memory barrier/flush in between to protect against funny race 
conditions.
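
Concretely, the pattern being asked about looks something like this (a
sketch only, not production code; the pthread_join between the two threads
supplies the synchronization and memory barrier mentioned above):

#include <mpi.h>
#include <pthread.h>

static MPI_Request reqs[2];
static int sendbuf = 42, recvbuf;

static void *post(void *arg)      /* thread 1: start the operations */
{
    MPI_Isend(&sendbuf, 1, MPI_INT, 0, 0, MPI_COMM_SELF, &reqs[0]);
    MPI_Irecv(&recvbuf, 1, MPI_INT, 0, 0, MPI_COMM_SELF, &reqs[1]);
    return NULL;
}

static void *complete(void *arg)  /* thread 2: complete them */
{
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    pthread_t t1, t2;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        MPI_Abort(MPI_COMM_WORLD, 1);

    pthread_create(&t1, NULL, post, NULL);
    pthread_join(t1, NULL);       /* synchronization + memory barrier */
    pthread_create(&t2, NULL, complete, NULL);
    pthread_join(t2, NULL);

    MPI_Finalize();
    return 0;
}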


If it makes things any easier on you, we can do this multiple-choice style:

1)  Forbidden by the MPI standard.
2)  Not forbidden by the MPI standard, but will not work with OMPI (not 
even with the BTLs that claim to be multi-threaded).

3)  Works well with OMPI (provided you use a BTL that's multi-threaded).

It's looking like #2 to me, but I'm not sure.


Re: [OMPI users] multi-threaded programming

2011-03-08 Thread Durga Choudhury
A follow-up question (and pardon if this sounds stupid) is this:

If I want to make my process multithreaded, BUT only one thread has
anything to do with MPI (for example, using OpenMP inside MPI), then
the results will be correct EVEN IF #1 or #2 of Eugene's list holds true.
Is this correct?

Thanks
Durga




Re: [OMPI users] multi-threaded programming

2011-03-08 Thread Eugene Loh

I believe this is thoroughly covered by the standard (though I suppose
the same could have been said about my question).

In any case, for your situation, initialize MPI with
MPI_Init_thread().  Ask for thread level MPI_THREAD_FUNNELED and check
that that level is provided.  That should cover your case.  See the man
page for MPI_Init_thread().  My question should not have anything to do
with your case.
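
(A minimal sketch of that MPI_THREAD_FUNNELED pattern, for reference:)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED) {
        fprintf(stderr, "MPI_THREAD_FUNNELED not provided\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* multi-threaded (e.g. OpenMP) computation goes here; only the
       thread that called MPI_Init_thread makes MPI calls */

    MPI_Finalize();
    return 0;
}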



Re: [OMPI users] MPI_ALLREDUCE bug with 1.5.2rc3r24441

2011-03-08 Thread Jeff Squyres
Try as I might, I cannot reproduce this error.  :-(

I only have the Intel compiler version 11.x, though -- not 12.

Can you change your test to use MPI_Reduce_local with INTEGER8 and see if the 
problem still occurs?  (it probably will, but it is a significantly simpler 
code path to get down to the INTEGER8 SUM operation back-end)
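
(For reference, a C analogue of that test might look like the sketch below;
an illustration only, since the original report used Fortran INTEGER8.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    long long in = 40, inout = 2;   /* 64-bit, like INTEGER8 */

    MPI_Init(&argc, &argv);
    /* applies the reduction op locally, with no communication at all */
    MPI_Reduce_local(&in, &inout, 1, MPI_LONG_LONG, MPI_SUM);
    printf("sum = %lld (expect 42)\n", inout);
    MPI_Finalize();
    return 0;
}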

If so, can you attach a debugger and see why Open MPI thinks it doesn't have an 
op for the (INTEGER8, SUM) combination?  

I'm sorry -- this is the best that I can offer since I can't reproduce the 
problem myself.  :-(


On Mar 3, 2011, at 3:35 PM, Harald Anlauf wrote:

> Please find attached the output of:
> 
> configure
> make all
> make install
> ompi_info -all
> mpif90 -v mpiallreducetest.f90
> ldd a.out
> ./a.out
> 
> System: OpenSuse Linux 11.1 on Core2Duo, i686
> 
> Compiler is:
> Intel(R) Fortran Compiler XE for applications running on IA-32, Version
> 12.0.1.107 Build 20101116
> 
> (The problem is the same with gfortran 4.6 (prerelease).)
> 
> Harald


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] Problem running openmpi-1.4.3

2011-03-08 Thread Amos Leffler
Hi,
I am trying to get openmpi-1.4.3 to run but am having trouble.
It is run using SUSE-11.3 with the Intel XE-2011 Composer C and Fortran
compilers.  The compilers installed without problems.  The openmpi
file was downloaded, unzipped, and untarred.  The ./configure
command was run, and it was found necessary to set CC=gcc and
CXX=g++.  The Fortran F77 and F90 compilers were set to ifort.  The
--prefix was set to /usr.  The program appeared to compile properly, but
none of the examples given would compile.  The error messages are shown
below:

linux-q2bz:/home/amosleffler/Downloads/openmpi-1.4.3/examples # mpicc
hello_c.c =o hello_c
mpicc: error while loading shared libraries: libopen-pal.so.0: cannot
open shared object file: No such file or directory
linux-q2bz:/home/amosleffler/Downloads/openmpi-1.4.3/examples # mpiCC
hello_cxx.cc -o hello_cxx
mpiCC: error while loading shared libraries: libopen-pal.so.0: cannot
open shared object file: No such file or directory
linux-q2bz:/home/amosleffler/Downloads/openmpi-1.4.3/examples # mpif77
hello_f77.f -o hello_f77
mpif77: error while loading shared libraries: libopen-pal.so.0: cannot
open shared object file: No such file or directory
linux-q2bz:/home/amosleffler/Downloads/openmpi-1.4.3/examples # mpif90
hello_f90.f90 -o hello_f90
mpif90: error while loading shared libraries: libopen-pal.so.0: cannot
open shared object file: No such file or directory
linux-q2bz:/home/amosleffler/Downloads/openmpi-1.4.3/examples

It is evident that the same error is present in all attempts to
compile, but I don't know why the library is missing.  Any help would be
much appreciated.


Amos Leffler


Re: [OMPI users] Problem running openmpi-1.4.3

2011-03-08 Thread David Zhang
You need to set your LD_LIBRARY_PATH to contain the MPI libraries.  The more
experienced MPI users on this mailing list can tell you what to include.




-- 
David Zhang
University of California, San Diego


Re: [OMPI users] Problem running openmpi-1.4.3

2011-03-08 Thread Ralph Castain
You need to set your LD_LIBRARY_PATH to point to where you installed openmpi.





Re: [OMPI users] Problem running openmpi-1.4.3

2011-03-08 Thread Gus Correa



These FAQs detail what David and Ralph said:

http://www.open-mpi.org/faq/?category=running#run-prereqs
http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path

You can prepend or append (don't overwrite)
the OpenMPI library directory to your current LD_LIBRARY_PATH,
so as to keep the Intel ifort library path there as well.
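
For example, in your .bashrc (adjust the directory to wherever
libopen-pal.so.0 actually landed under your --prefix):

   export LD_LIBRARY_PATH=/usr/lib:$LD_LIBRARY_PATH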

My two cents,
Gus Correa




[OMPI users] problems with establishing an intercommunicator

2011-03-08 Thread Waclaw Kusnierczyk

Hello,

I'm trying to connect two independent MPI process groups with an 
intercommunicator, using ports, as described in sec. 10.4 of the MPI 
standard.  One group runs a server, the other a client.  The server 
opens a port, publishes the port's name, and waits for a connection.  
The client obtains the port's name, and connects to it.  The problem is, 
the code works if both the server and the client are run in a 
one-process MPI group each.  If any of the MPI groups has more than one 
process, the program hangs.
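
(As an aside: the MPI standard also provides a name service for exchanging
the port name, sketched below for the server side; the client side would
use MPI_Lookup_name symmetrically.  The service name "my_service" is
hypothetical, and with Open MPI, publish/lookup across two separate mpirun
jobs may need a common name server such as ompi-server.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm that;

    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, port);
    MPI_Publish_name("my_service", MPI_INFO_NULL, port);
    /* collective over MPI_COMM_WORLD: every rank calls accept */
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &that);
    printf("[server] connected\n");
    MPI_Unpublish_name("my_service", MPI_INFO_NULL, port);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}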


The following are two fragments of a minimal code example reproducing 
the problem on my machine.  The server:


if (rank == 0) {
MPI_Open_port(MPI_INFO_NULL, port);
int fifo = open(argv[1], O_WRONLY);
write(fifo, port, MPI_MAX_PORT_NAME);
close(fifo);
printf("[server] listening on port '%s'\n", port);
MPI_Comm_accept(port, MPI_INFO_NULL, 0, this, &that);
printf("[server] connected\n");
MPI_Close_port(port); }
MPI_Barrier(this);

and the client:

if (rank == 0) {
int fifo = open(buffer, O_RDONLY);
read(fifo, port, MPI_MAX_PORT_NAME);
close(fifo);
printf("[client] connecting to port '%s'\n", port);
MPI_Comm_connect(port, MPI_INFO_NULL, 0, this, &that);
printf("[client] connected\n"); }
MPI_Barrier(this);

where 'this' is the local MPI_COMM_WORLD, and the port name is 
transmitted via a named pipe.  (Complete code together with a makefile 
is attached for reference.)


When the compiled codes are run on one MPI process each:

mkfifo port
mpirun -np 1 ./server port &
mpirun -np 1 ./client port

the connection is established as expected.  With more than one process 
on either side, however, the execution blocks at the connect-accept step 
(i.e., after the 'listening' and 'connecting' messages are printed, but 
before the 'connected' messages are); using the attached code,


make NS=2 run

or

make NC=2 run

should reproduce the problem.

I'm using OpenMPI on two different machines: 1.4 on a 2-core laptop, and
1.3.3 on a large supercomputer, and I have the same problem on both.  Where
am I going wrong?


One more, related question: once I manage to establish an
intercommunicator between two multi-process MPI groups, can any process in
one group send a message to any process in the other directly, or does
the communication have to go through the root nodes?


Regards,
Wacek



rendezvous.tgz
Description: application/compressed-tar