Yep, I use ldd every day. But here the problem comes from a corrupted structure in MorphMPI and MPI:

typedef struct{
 int MorphMPI_SOURCE;
 int MorphMPI_TAG;
 int MorphMPI_ERROR;
 void* mpi_status ;
} MorphMPI_Status ;

Here the member mpi_status is used to point to a real MPI_Status. In MPICH:

typedef struct{
 int MPI_SOURCE;
 int MPI_TAG;
 int MPI_ERROR;
 int count ;
} MPI_Status ;

Then, when my MorphMPI_Status is passed to MorphMPI_Get_count(), the pointer MorphMPI_Status::mpi_status itself is not corrupted, but the count field of the MPI_Status it points to is: the value should be 4, and instead it is random.
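To make the pointer relationship concrete, here is a minimal sketch that compiles without MPI. Everything in it beyond the two struct layouts quoted above is an assumption for illustration: Mock_MPI_Status stands in for the real MPI_Status, and morph_status_init, morph_fake_recv, morph_get_count and morph_status_free are hypothetical helpers, not MorphMPI's actual API. The idea it shows is that the wrapper owns the storage mpi_status points to, so the pointer cannot outlive the object behind it. A real wrapper would call MPI_Get_count() on the underlying status rather than read a field directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Mock of the MPICH status layout quoted above (assumption: a stand-in
 * so that the sketch builds without an MPI installation). */
typedef struct {
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    int count;
} Mock_MPI_Status;

typedef struct {
    int MorphMPI_SOURCE;
    int MorphMPI_TAG;
    int MorphMPI_ERROR;
    void *mpi_status;   /* storage owned by this wrapper object */
} MorphMPI_Status;

/* Hypothetical helper: allocate the underlying status on the heap.
 * Returns 0 on success, -1 if the allocation fails. */
int morph_status_init(MorphMPI_Status *s)
{
    assert(s != NULL);
    s->MorphMPI_SOURCE = 0;
    s->MorphMPI_TAG = 0;
    s->MorphMPI_ERROR = 0;
    s->mpi_status = calloc(1, sizeof(Mock_MPI_Status));
    return s->mpi_status ? 0 : -1;
}

/* Hypothetical stand-in for a receive: fills in both levels of status. */
void morph_fake_recv(MorphMPI_Status *s, int source, int tag, int count)
{
    Mock_MPI_Status *real = s->mpi_status;
    assert(real != NULL);
    real->MPI_SOURCE = source;
    real->MPI_TAG = tag;
    real->MPI_ERROR = 0;
    real->count = count;
    s->MorphMPI_SOURCE = source;
    s->MorphMPI_TAG = tag;
    s->MorphMPI_ERROR = 0;
}

/* Hypothetical accessor: reads the count through the stored pointer. */
int morph_get_count(const MorphMPI_Status *s)
{
    const Mock_MPI_Status *real = s->mpi_status;
    assert(real != NULL);
    return real->count;
}

/* Release the owned storage and clear the pointer. */
void morph_status_free(MorphMPI_Status *s)
{
    free(s->mpi_status);
    s->mpi_status = NULL;
}
```

If mpi_status instead pointed at a local variable inside a wrapper function that has already returned, the pointer value would still look valid while the fields behind it held garbage. That is only one possible explanation and is not confirmed by anything in this thread.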

I tried changing the MorphMPI_Status structure (adding another integer to align it on 64 bits, keeping only the void*, ...) without success.
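One thing that may be worth ruling out alongside the alignment experiments is a layout mismatch between the shared library and the application, for example if the two were built against different headers or with different compiler flags. The sketch below is an assumption about how one might check that, not a diagnosis: morph_print_layout, morph_status_size and morph_status_ptr_offset are hypothetical names. Compiling the same functions into both the library and the application and comparing their output would show whether the two sides agree on the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    int MorphMPI_SOURCE;
    int MorphMPI_TAG;
    int MorphMPI_ERROR;
    void *mpi_status;
} MorphMPI_Status;

/* Hypothetical helper: size of the wrapper status as seen by this
 * translation unit. */
size_t morph_status_size(void)
{
    return sizeof(MorphMPI_Status);
}

/* Hypothetical helper: byte offset of the mpi_status pointer, including
 * any padding the compiler inserted after the three ints. */
size_t morph_status_ptr_offset(void)
{
    return offsetof(MorphMPI_Status, mpi_status);
}

/* Hypothetical helper: print the layout so that the library's view and
 * the application's view can be compared side by side. */
void morph_print_layout(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "sizeof(MorphMPI_Status) = %zu\n", morph_status_size());
    fprintf(out, "offsetof(mpi_status)    = %zu\n", morph_status_ptr_offset());
    fprintf(out, "sizeof(void *)          = %zu\n", sizeof(void *));
}
```

If the numbers printed from inside the shared library differ from those printed by the application, the two were not compiled against the same definition.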

As a reminder, this problem appears only when MPI is used through a dynamically linked MorphMPI library.

Does anyone have an idea?

Mathieu Gontier
Core Development Engineer

Read the attached v-card for telephone, fax, address
Look at our web-site http://www.fft.be




Joe Landman wrote:
Greetings Mathieu:

Mathieu Gontier wrote:

[...]

So, I have run into a small problem, whatever the MPI library used (I tried MPICH-1.2.5.2, MPICH-GM and Intel MPI). When MorphMPI is linked statically with my parallel application, everything is OK; but when MorphMPI is linked dynamically with my parallel application, MPI_Get_count returns a wrong value.

I concluded that it is difficult to use an MPI library through a shared library. I wonder if anyone has more information about it (in this

Not likely.  I would suggest ldd.  It is your friend.

For example:

[EMAIL PROTECTED]:~/workspace/source-mpi$ ldd matmul_mpi_3.exe
        libm.so.6 => /lib/libm.so.6 (0x00002b5409d17000)
        libmpi.so.0 => not found
        libopen-rte.so.0 => not found
        libopen-pal.so.0 => not found
        librt.so.1 => /lib/librt.so.1 (0x00002b5409f99000)
        libdl.so.2 => /lib/libdl.so.2 (0x00002b540a1a2000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x00002b540a3a6000)
        libutil.so.1 => /lib/libutil.so.1 (0x00002b540a5c0000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00002b540a7c3000)
        libc.so.6 => /lib/libc.so.6 (0x00002b540a9de000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b5409af9000)

Notice that libmpi.so.0 is not found, so I can't run this by hand, unless I force the issue using LD_LIBRARY_PATH:

[EMAIL PROTECTED]:~/workspace/source-mpi$ export LD_LIBRARY_PATH="/home/joe/local/lib64/:/home/joe/local/lib/"
[EMAIL PROTECTED]:~/workspace/source-mpi$ ldd matmul_mpi_3.exe
        libm.so.6 => /lib/libm.so.6 (0x00002ae35ca50000)
        libmpi.so.0 => /home/joe/local/lib/libmpi.so.0 (0x00002ae35ccd1000)
        libopen-rte.so.0 => /home/joe/local/lib/libopen-rte.so.0 (0x00002ae35cfe8000)
        libopen-pal.so.0 => /home/joe/local/lib/libopen-pal.so.0 (0x00002ae35d2b3000)
        librt.so.1 => /lib/librt.so.1 (0x00002ae35d514000)
        libdl.so.2 => /lib/libdl.so.2 (0x00002ae35d71d000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x00002ae35d921000)
        libutil.so.1 => /lib/libutil.so.1 (0x00002ae35db3b000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00002ae35dd3e000)
        libc.so.6 => /lib/libc.so.6 (0x00002ae35df59000)
        /lib64/ld-linux-x86-64.so.2 (0x00002ae35c832000)

and it might even run ...

[EMAIL PROTECTED]:~/workspace/source-mpi$ ./matmul_mpi_3.exe
D[tid=0]: running on machine = pegasus-i
D: checking arguments: N_args=1
D: arg[0] = ./matmul_mpi_3.exe
Allocating memory ...
array size in MB = 7.629 MB
 (remember, you have 2 of these)normalization a: 0.05510,  b: 0.00173
0 : loop_min = 0, loop_max = 1000
...

Do you have some sort of LD_LIBRARY_PATH set up? Or something set in /etc/ld.so.conf that points to where these things are? Remember, mpirun/mpiexec's alternative purpose in life is to set up the correct run-time environment for you, so you might want to see what is going on with the environment in your equivalent command.
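One way to see what environment the launched process actually gets is to have the program report it itself. The following is a minimal sketch under that assumption; dump_ld_library_path is a hypothetical helper name, and it only looks at LD_LIBRARY_PATH, not at /etc/ld.so.conf or any run path embedded in the binary.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write each colon-separated LD_LIBRARY_PATH entry
 * on its own line and return the number of entries (0 if the variable
 * is unset or empty). Empty entries between colons are counted too. */
int dump_ld_library_path(FILE *out)
{
    assert(out != NULL);

    const char *value = getenv("LD_LIBRARY_PATH");
    if (value == NULL || *value == '\0') {
        fprintf(out, "LD_LIBRARY_PATH is not set\n");
        return 0;
    }

    int n = 0;
    const char *start = value;
    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        fprintf(out, "LD_LIBRARY_PATH[%d] = %.*s\n", n, (int)len, start);
        n++;
        if (end == NULL)
            break;
        start = end + 1;
    }
    return n;
}
```

Calling dump_ld_library_path(stderr) early in main(), once when the program is started by hand and once when it is started through mpirun or mpiexec, would show whether the launcher changes the search path.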


_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
