Re: [OMPI users] pgi and gcc runtime compatibility

2008-12-07 Thread David Singleton


I seem to remember Fortran logicals being represented differently in
PGI than in other Fortran compilers (1 vs. -1, maybe - I can't remember).
That causes grief with things like MPI_Test.
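
A minimal C sketch of the kind of breakage I mean, assuming the 1 vs. -1
recollection above is right (the values are stand-ins, not verified
PGI/gfortran behaviour): a "library" built with one convention sets the
flag, and a "caller" built with the other tests it strictly.

  /* Hedged illustration of the LOGICAL-mismatch hazard; LIB_TRUE and
   * APP_TRUE are assumed values standing in for two compilers'
   * representations of .TRUE. */
  #include <stdio.h>

  #define LIB_TRUE   1    /* value the library writes for "true"  */
  #define APP_TRUE (-1)   /* value the application tests against  */

  static void fake_mpi_test(int *flag)
  {
      *flag = LIB_TRUE;   /* request completed: library says true */
  }

  int main(void)
  {
      int flag = 0;
      fake_mpi_test(&flag);
      if (flag == APP_TRUE)   /* strict comparison never matches  */
          printf("request complete\n");
      else
          printf("caller never sees completion: flag=%d\n", flag);
      return 0;
  }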

David

Brock Palen wrote:
I did something today that I was happy worked, but I want to know if
anyone has had problems with it.


At runtime (not compile time), would an Open MPI built with PGI work to run
a code that was compiled against the same version of Open MPI, but built
with gcc?  I tested a few apps today after I accidentally did this and
found it worked.  They were all C/C++ apps (NAMD and GROMACS), but what
about Fortran apps?  Should we expect problems if someone does this?


I am not going to encourage this, but it is more for cases where it is needed.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985







Re: [OMPI users] pgi and gcc runtime compatibility

2008-12-07 Thread Terry Frankcombe
Many of today's compilers for Linux (PGI, Intel, etc.) are designed to
be link-compatible with gcc.  That must extend to calling conventions
(mangling schemes, argument passing, etc.).

If they're statically link-compatible, surely this applies to dynamic
(runtime) linking too, right?

Is there stuff going on internally in OMPI that requires tighter
integration between app and library than standard function calls tying
them together?  How invasive is the memory management stuff?
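
For what it's worth, here's a hedged toy sketch of the mangling side of
this (my_barrier is a made-up name, and this is not Open MPI's actual
mechanism): with gcc, one Fortran-callable entry point can be exported
under two common mangling schemes.

  /* Toy example: the same entry point under single- and
   * double-underscore Fortran manglings.  my_barrier is hypothetical. */
  #include <stdio.h>

  void my_barrier_(int *comm, int *ierr)
  {
      *ierr = 0;  /* real work would happen here */
      printf("barrier on communicator %d\n", *comm);
  }

  /* GCC's alias attribute re-exports the same code under the
   * double-underscore name some Fortran compilers generate. */
  void my_barrier__(int *comm, int *ierr)
      __attribute__((alias("my_barrier_")));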



On Sun, 2008-12-07 at 22:06 -0500, Brock Palen wrote:
> I did something today that I was happy worked, but I want to know if
> anyone has had problems with it.
> 
> At runtime (not compile time), would an Open MPI built with PGI work to
> run a code that was compiled against the same version of Open MPI, but
> built with gcc?  I tested a few apps today after I accidentally did this
> and found it worked.  They were all C/C++ apps (NAMD and GROMACS), but
> what about Fortran apps?  Should we expect problems if someone does this?
> 
> I am not going to encourage this, but it is more for cases where it is needed.
> 
> 
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985
> 
> 
> 



[OMPI users] pgi and gcc runtime compatibility

2008-12-07 Thread Brock Palen
I did something today that I was happy worked, but I want to know if
anyone has had problems with it.


At runtime (not compile time), would an Open MPI built with PGI work to
run a code that was compiled against the same version of Open MPI, but
built with gcc?  I tested a few apps today after I accidentally did this
and found it worked.  They were all C/C++ apps (NAMD and GROMACS), but
what about Fortran apps?  Should we expect problems if someone does this?


I am not going to encourage this, but it is more for cases where it is needed.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985





Re: [OMPI users] Problem with feupdateenv

2008-12-07 Thread Brian Dobbins
Hi Sangamesh,

  I think the problem is that you're loading a different version of OpenMPI
at runtime:

[master:17781] [ 1] /usr/lib64/openmpi/libmpi.so.0 [0x34b19544b8]

  ... The path there is to '/usr/lib64/openmpi', which is probably a
system-installed GCC version.  You want to use your version in:

  /opt/openmpi_intel/1.2.8/

  You probably just need to reset your LD_LIBRARY_PATH environment variable
to reflect this new path, such as:

(for bash)
export LD_LIBRARY_PATH=/opt/openmpi_intel/1.2.8/lib:${LD_LIBRARY_PATH}

  ... By doing this, it should find the proper library files (assuming
that's the directory they're in - check your install!).  You may also wish to
remove the old version of OpenMPI that came with the system - a yum 'list'
command should show you the package, and then just remove it.  The
'feupdateenv' thing is more of a red herring, I think... this happens (I
think!) because the system uses a Linux version of the library instead of an
Intel one.  You can add the flag '-shared-intel' to your compile flags or
command line and that should get rid of that, if it bugs you.  Someone else
can, I'm sure, explain in far more detail what the issue there is.

  Hope that helps... if not, post the output of 'ldd hellompi' here, as well
as an 'ls /opt/openmpi_intel/1.2.8/'.
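
  If it's still unclear which library gets picked up, here's a hedged
sketch (hypothetical file name whichmpi.c; build with something like
'mpicc whichmpi.c -o whichmpi -ldl') that prints the shared object
actually providing MPI_Init at runtime:

  /* Report which shared object provides MPI_Init at runtime. */
  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      Dl_info info;

      /* dladdr() maps a code address back to the shared object
       * that contains it; ask about the MPI_Init entry point.  */
      if (dladdr((void *)MPI_Init, &info) && info.dli_fname)
          printf("MPI_Init resolved from: %s\n", info.dli_fname);
      else
          printf("could not resolve MPI_Init's origin\n");

      MPI_Init(&argc, &argv);
      MPI_Finalize();
      return 0;
  }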

  Cheers!
  - Brian



On Sun, Dec 7, 2008 at 9:50 AM, Sangamesh B  wrote:

> Hello all,
>
> Installed Open MPI 1.2.8 with the Intel C++ compilers on a CentOS 4.5-based
> Rocks 4.3 Linux cluster (& Voltaire InfiniBand). Installation was
> smooth.
>
> The following warning appeared during compilation:
>
> # mpicc hellompi.c -o hellompi
> /opt/intel/cce/10.1.018/lib/libimf.so: warning: warning: feupdateenv
> is not implemented and will always fail
>
> It produced the executable. But during execution it failed with
> Segmentation fault:
>
>  # which mpirun
> /opt/openmpi_intel/1.2.8/bin/mpirun
> # mpirun -np 2 ./hellompi
> ./hellompi: Symbol `ompi_mpi_comm_world' has different size in shared
> object, consider re-linking
> ./hellompi: Symbol `ompi_mpi_comm_world' has different size in shared
> object, consider re-linking
> [master:17781] *** Process received signal ***
> [master:17781] Signal: Segmentation fault (11)
> [master:17781] Signal code: Address not mapped (1)
> [master:17781] Failing at address: 0x10
> [master:17781] [ 0] /lib64/tls/libpthread.so.0 [0x34b150c4f0]
> [master:17781] [ 1] /usr/lib64/openmpi/libmpi.so.0 [0x34b19544b8]
> [master:17781] [ 2]
> /usr/lib64/openmpi/libmpi.so.0(ompi_proc_init+0x14d) [0x34b1954cfd]
> [master:17781] [ 3] /usr/lib64/openmpi/libmpi.so.0(ompi_mpi_init+0xba)
> [0x34b19567da]
> [master:17781] [ 4] /usr/lib64/openmpi/libmpi.so.0(MPI_Init+0x94)
> [0x34b1977ab4]
> [master:17781] [ 5] ./hellompi(main+0x44) [0x401c0c]
> [master:17781] [ 6] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> [0x34b0e1c3fb]
> [master:17781] [ 7] ./hellompi [0x401b3a]
> [master:17781] *** End of error message ***
> [master:17778] [0,0,0]-[0,1,1] mca_oob_tcp_msg_recv: readv failed:
> Connection reset by peer (104)
> mpirun noticed that job rank 0 with PID 17781 on node master exited on
> signal 11 (Segmentation fault).
> 1 additional process aborted (not shown)
>
> But this is not the case during non-MPI C code compilation or execution.
>
> # icc sample.c -o sample
> # ./sample
>
> Compiler is working
> #
>
> What might be the reason for this, and how can it be resolved?
>
> Thanks,
> Sangamesh
>


[OMPI users] Problem with feupdateenv

2008-12-07 Thread Sangamesh B
Hello all,

Installed Open MPI 1.2.8 with the Intel C++ compilers on a CentOS 4.5-based
Rocks 4.3 Linux cluster (& Voltaire InfiniBand). Installation was
smooth.
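
For reference, hellompi.c is just a minimal MPI hello-world, along these
lines (a representative sketch; the exact source may differ):

  /* Representative minimal MPI hello-world. */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      printf("Hello from rank %d of %d\n", rank, size);
      MPI_Finalize();
      return 0;
  }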

The following warning appeared during compilation:

# mpicc hellompi.c -o hellompi
/opt/intel/cce/10.1.018/lib/libimf.so: warning: warning: feupdateenv
is not implemented and will always fail

It produced the executable. But during execution it failed with
Segmentation fault:

 # which mpirun
/opt/openmpi_intel/1.2.8/bin/mpirun
# mpirun -np 2 ./hellompi
./hellompi: Symbol `ompi_mpi_comm_world' has different size in shared
object, consider re-linking
./hellompi: Symbol `ompi_mpi_comm_world' has different size in shared
object, consider re-linking
[master:17781] *** Process received signal ***
[master:17781] Signal: Segmentation fault (11)
[master:17781] Signal code: Address not mapped (1)
[master:17781] Failing at address: 0x10
[master:17781] [ 0] /lib64/tls/libpthread.so.0 [0x34b150c4f0]
[master:17781] [ 1] /usr/lib64/openmpi/libmpi.so.0 [0x34b19544b8]
[master:17781] [ 2]
/usr/lib64/openmpi/libmpi.so.0(ompi_proc_init+0x14d) [0x34b1954cfd]
[master:17781] [ 3] /usr/lib64/openmpi/libmpi.so.0(ompi_mpi_init+0xba)
[0x34b19567da]
[master:17781] [ 4] /usr/lib64/openmpi/libmpi.so.0(MPI_Init+0x94) [0x34b1977ab4]
[master:17781] [ 5] ./hellompi(main+0x44) [0x401c0c]
[master:17781] [ 6] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x34b0e1c3fb]
[master:17781] [ 7] ./hellompi [0x401b3a]
[master:17781] *** End of error message ***
[master:17778] [0,0,0]-[0,1,1] mca_oob_tcp_msg_recv: readv failed:
Connection reset by peer (104)
mpirun noticed that job rank 0 with PID 17781 on node master exited on
signal 11 (Segmentation fault).
1 additional process aborted (not shown)

But this is not the case during non-MPI C code compilation or execution.

# icc sample.c -o sample
# ./sample

Compiler is working
#

What might be the reason for this, and how can it be resolved?

Thanks,
Sangamesh