[OMPI users] (no subject)

2018-10-31 Thread Dmitry N. Mikushin
Dear all,

ompi_info reports pml components are available:

$ /usr/mpi/gcc/openmpi-3.1.0rc2/bin/ompi_info -a | grep pml
 MCA pml: v (MCA v2.1.0, API v2.0.0, Component v3.1.0)
 MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component
v3.1.0)
 MCA pml: yalla (MCA v2.1.0, API v2.0.0, Component v3.1.0)
 MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v3.1.0)
 MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v3.1.0)
 MCA pml: ucx (MCA v2.1.0, API v2.0.0, Component v3.1.0)

However, when I try to use them, mpirun gives back:

--
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host:  cloudgpu6
  Framework: pml
--

With strace I can see that the libraries
/usr/mpi/gcc/openmpi-3.1.0rc2/lib64/openmpi/mca_pml_* are actually accessed
by mpirun. I can also see that ldd does not show any unresolved
dependencies for them.

How else could it be that the pml components are not found?
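
For what it's worth, a minimal dlopen() check like the sketch below usually
gives a more specific reason than ldd, because dlerror() also reports
missing symbols, not only missing libraries. The file names are only an
example following the installation prefix above (any of the mca_pml_*
components will do); compile with -ldl:

/* Load libmpi first so that its symbols are globally visible, roughly as
 * they would be inside an MPI process, then try to dlopen() one of the
 * pml components and print the loader's error message on failure. */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    const char *lib  = "/usr/mpi/gcc/openmpi-3.1.0rc2/lib64/libmpi.so";
    const char *comp =
        "/usr/mpi/gcc/openmpi-3.1.0rc2/lib64/openmpi/mca_pml_ob1.so";

    if (!dlopen(lib, RTLD_NOW | RTLD_GLOBAL))
        printf("dlopen(%s) failed: %s\n", lib, dlerror());

    if (!dlopen(comp, RTLD_NOW | RTLD_GLOBAL))
        printf("dlopen(%s) failed: %s\n", comp, dlerror());
    else
        printf("dlopen(%s) succeeded\n", comp);

    return 0;
}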

Thanks,
- Dmitry.
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] EBADF (Bad file descriptor) on a simplest "Hello world" program

2018-06-02 Thread Dmitry N. Mikushin
ping

2018-06-01 22:29 GMT+03:00 Dmitry N. Mikushin :

> Dear all,
>
> Looks like I have a weird issue never encountered before. While trying to
> run simplest "Hello world" program, I get:
>
> $ cat hello.c
> #include <mpi.h>
>
> int main(int argc, char* argv[])
> {
> MPI_Init(&argc, &argv);
>
> MPI_Finalize();
>
> return 0;
> }
> $ mpicc hello.c -o hello
> $ mpirun -np 1 ./hello
> --
> WARNING: The accept(3) system call failed on a TCP socket.  While this
> should generally never happen on a well-configured HPC system, the
> most common causes when it does occur are:
>
>   * The process ran out of file descriptors
>   * The operating system ran out of file descriptors
>   * The operating system ran out of memory
>
> Your Open MPI job will likely hang until the failure resason is fixed
> (e.g., more file descriptors and/or memory becomes available), and may
> eventually timeout / abort.
>
>   Local host: M17xR4
>   Errno:  9 (Bad file descriptor)
>   Probable cause: Unknown cause; job will try to continue
> --
>
> Further tracing shows the following:
>
> [pid 13498] accept(0, 0x7f2ec8000960, 0x7f2ee6740e7c) = -1 EBADF (Bad file
> descriptor)
> [pid 13498] shutdown(0, SHUT_RDWR)  = -1 EBADF (Bad file descriptor)
> [pid 13498] close(0)= -1 EBADF (Bad file descriptor)
> [pid 13498] open("/usr/share/openmpi/help-oob-tcp.txt", O_RDONLY) = 0
> [pid 13498] ioctl(0, TCGETS, 0x7f2ee6740be0) = -1 ENOTTY (Inappropriate
> ioctl for device)
> [pid 13499] <... nanosleep resumed> NULL) = 0
> [pid 13498] fstat(0,  
> [pid 13499] nanosleep({0, 10},  
> [pid 13498] <... fstat resumed> {st_mode=S_IFREG|0644, st_size=3025, ...})
> = 0
> [pid 13498] read(0, "# -*- text -*-\n#\n# Copyright (c)"..., 8192) = 3025
> [pid 13498] read(0, "", 4096)   = 0
> [pid 13498] read(0, "", 8192)   = 0
> [pid 13498] ioctl(0, TCGETS, 0x7f2ee6740b40) = -1 ENOTTY (Inappropriate
> ioctl for device)
> [pid 13498] close(0)= 0
> [pid 13499] <... nanosleep resumed> NULL) = 0
> [pid 13499] nanosleep({0, 10},  
> [pid 13498] write(1, ""...,
> 768-
> -
> WARNING: The accept(3) system call failed on a TCP socket.  While this
> should generally never happen on a well-configured HPC system, the
> most common causes when it does occur are:
>
>   * The process ran out of file descriptors
>   * The operating system ran out of file descriptors
>   * The operating system ran out of memory
>
> Your Open MPI job will likely hang until the failure resason is fixed
> (e.g., more file descriptors and/or memory becomes available), and may
> eventually timeout / abort.
>
>   Local host: M17xR4
>   Errno:  9 (Bad file descriptor)
>   Probable cause: Unknown cause; job will try to continue
> --
> ) = 768
>
> In fact, "Bad file descriptor" first occurs a bit earlier, here:
>
> [pid 13499] open("/proc/self/fd", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC)
> = 20
> [pid 13499] fstat(20, {st_mode=S_IFDIR|0500, st_size=0, ...}) = 0
> [pid 13499] getdents(20, /* 25 entries */, 32768) = 600
> [pid 13499] close(3)= 0
> [pid 13499] close(4)= 0
> [pid 13499] close(5)= 0
> [pid 13499] close(6)= 0
> [pid 13499] close(7)= 0
> [pid 13499] close(8)= 0
> [pid 13499] close(9)= 0
> [pid 13499] close(10)   = 0
> [pid 13499] close(11)   = 0
> [pid 13499] close(12)   = 0
> [pid 13499] close(13)   = 0
> [pid 13499] close(14)   = 0
> [pid 13499] close(15)   = 0
> [pid 13499] close(16)   = 0
> [pid 13499] close(17)   = 0
> [pid 13499] close(18)   = 0
> [pid 13499] close(19)   = 0
> [pid 13499] close(20)   = 0
> [pid 13499] getdents(20, 0x1cc04a0, 32768) = -1 EBADF (Bad file descriptor)
> [pid 13499] close(20)   = -1 EBADF (Bad file descriptor)
>
> Any idea how to fix this? System is Ubuntu 16.04:
>
> Linux M17xR4 4.10.0-42-generic #46~16.04.1-Ubuntu SMP Mon Dec 4 15:57:59
> UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> Kind regards,
> - Dmitry.
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] EBADF (Bad file descriptor) on a simplest "Hello world" program

2018-06-01 Thread Dmitry N. Mikushin
Dear all,

Looks like I have a weird issue that I have never encountered before. While
trying to run the simplest "Hello world" program, I get:

$ cat hello.c
#include <mpi.h>

int main(int argc, char* argv[])
{
MPI_Init(&argc, &argv);

MPI_Finalize();

return 0;
}
$ mpicc hello.c -o hello
$ mpirun -np 1 ./hello
--
WARNING: The accept(3) system call failed on a TCP socket.  While this
should generally never happen on a well-configured HPC system, the
most common causes when it does occur are:

  * The process ran out of file descriptors
  * The operating system ran out of file descriptors
  * The operating system ran out of memory

Your Open MPI job will likely hang until the failure resason is fixed
(e.g., more file descriptors and/or memory becomes available), and may
eventually timeout / abort.

  Local host: M17xR4
  Errno:  9 (Bad file descriptor)
  Probable cause: Unknown cause; job will try to continue
--

Further tracing shows the following:

[pid 13498] accept(0, 0x7f2ec8000960, 0x7f2ee6740e7c) = -1 EBADF (Bad file
descriptor)
[pid 13498] shutdown(0, SHUT_RDWR)  = -1 EBADF (Bad file descriptor)
[pid 13498] close(0)= -1 EBADF (Bad file descriptor)
[pid 13498] open("/usr/share/openmpi/help-oob-tcp.txt", O_RDONLY) = 0
[pid 13498] ioctl(0, TCGETS, 0x7f2ee6740be0) = -1 ENOTTY (Inappropriate
ioctl for device)
[pid 13499] <... nanosleep resumed> NULL) = 0
[pid 13498] fstat(0,  
[pid 13499] nanosleep({0, 10},  
[pid 13498] <... fstat resumed> {st_mode=S_IFREG|0644, st_size=3025, ...})
= 0
[pid 13498] read(0, "# -*- text -*-\n#\n# Copyright (c)"..., 8192) = 3025
[pid 13498] read(0, "", 4096)   = 0
[pid 13498] read(0, "", 8192)   = 0
[pid 13498] ioctl(0, TCGETS, 0x7f2ee6740b40) = -1 ENOTTY (Inappropriate
ioctl for device)
[pid 13498] close(0)= 0
[pid 13499] <... nanosleep resumed> NULL) = 0
[pid 13499] nanosleep({0, 10},  
[pid 13498] write(1, ""...,
768--
WARNING: The accept(3) system call failed on a TCP socket.  While this
should generally never happen on a well-configured HPC system, the
most common causes when it does occur are:

  * The process ran out of file descriptors
  * The operating system ran out of file descriptors
  * The operating system ran out of memory

Your Open MPI job will likely hang until the failure resason is fixed
(e.g., more file descriptors and/or memory becomes available), and may
eventually timeout / abort.

  Local host: M17xR4
  Errno:  9 (Bad file descriptor)
  Probable cause: Unknown cause; job will try to continue
--
) = 768

In fact, "Bad file descriptor" first occurs a bit earlier, here:

[pid 13499] open("/proc/self/fd",
O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 20
[pid 13499] fstat(20, {st_mode=S_IFDIR|0500, st_size=0, ...}) = 0
[pid 13499] getdents(20, /* 25 entries */, 32768) = 600
[pid 13499] close(3)= 0
[pid 13499] close(4)= 0
[pid 13499] close(5)= 0
[pid 13499] close(6)= 0
[pid 13499] close(7)= 0
[pid 13499] close(8)= 0
[pid 13499] close(9)= 0
[pid 13499] close(10)   = 0
[pid 13499] close(11)   = 0
[pid 13499] close(12)   = 0
[pid 13499] close(13)   = 0
[pid 13499] close(14)   = 0
[pid 13499] close(15)   = 0
[pid 13499] close(16)   = 0
[pid 13499] close(17)   = 0
[pid 13499] close(18)   = 0
[pid 13499] close(19)   = 0
[pid 13499] close(20)   = 0
[pid 13499] getdents(20, 0x1cc04a0, 32768) = -1 EBADF (Bad file descriptor)
[pid 13499] close(20)   = -1 EBADF (Bad file descriptor)
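
To make explicit what the trace shows, the pattern boils down to something
like the sketch below (an illustration only, not Open MPI's actual code): a
cleanup loop walks /proc/self/fd and closes every descriptor listed there,
including the descriptor of the directory stream it is iterating over, so
the following getdents()/close() on that descriptor fail with EBADF, and
descriptors that other parts of the process still rely on may get closed as
well.

#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

static void close_all_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        int fd = atoi(entry->d_name);   /* "." and ".." also map to 0 */
        close(fd);                      /* may close dirfd(dir) itself */
    }
    closedir(dir);                      /* EBADF once its own fd is gone */
}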

Any idea how to fix this? System is Ubuntu 16.04:

Linux M17xR4 4.10.0-42-generic #46~16.04.1-Ubuntu SMP Mon Dec 4 15:57:59
UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Kind regards,
- Dmitry.
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Crash in libopen-pal.so

2017-06-19 Thread Dmitry N. Mikushin
Hi Justin,

If you can build the application in debug mode, try inserting valgrind into
your MPI command. It is usually very good at tracking down the origins of
failing memory allocations.
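
For example, with the command line from the report below, something like
"mpirun --oversubscribe -np 1 valgrind --leak-check=full ./hacc_tpm" (the
valgrind options are only a typical starting point) should point at the
first invalid write or free rather than at the later crash inside
ptmalloc2.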

Kind regards,
- Dmitry.


2017-06-20 1:10 GMT+03:00 Sylvain Jeaugey :

> Justin, can you try setting mpi_leave_pinned to 0 to disable libptmalloc2
> and confirm this is related to ptmalloc ?
>
> Thanks,
> Sylvain
> On 06/19/2017 03:05 PM, Justin Luitjens wrote:
>
> I have an application that works on other systems but on the current
> system I’m running I’m seeing the following crash:
>
>
>
> [dt04:22457] *** Process received signal ***
>
> [dt04:22457] Signal: Segmentation fault (11)
>
> [dt04:22457] Signal code: Address not mapped (1)
>
> [dt04:22457] Failing at address: 0x6a1da250
>
> [dt04:22457] [ 0] /lib64/libpthread.so.0(+0xf370)[0x2b353370]
>
> [dt04:22457] [ 1] /home/jluitjens/libs/openmpi/lib/libopen-pal.so.13(opal_
> memory_ptmalloc2_int_free+0x50)[0x2cbcf810]
>
> [dt04:22457] [ 2] /home/jluitjens/libs/openmpi/lib/libopen-pal.so.13(opal_
> memory_ptmalloc2_free+0x9b)[0x2cbcff3b]
>
> [dt04:22457] [ 3] ./hacc_tpm[0x42f068]
>
> [dt04:22457] [ 4] ./hacc_tpm[0x42f231]
>
> [dt04:22457] [ 5] ./hacc_tpm[0x40f64d]
>
> [dt04:22457] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2c30db35]
>
> [dt04:22457] [ 7] ./hacc_tpm[0x4115cf]
>
> [dt04:22457] *** End of error message ***
>
>
>
>
>
> This app is a CUDA app but doesn’t use GPU direct so that should be
> irrelevant.
>
>
>
> I’m building with gcc/5.3.0  cuda/8.0.44  openmpi/1.10.7
>
>
>
> I’m using this on centos 7 and am using a vanilla MPI configure line:
> ./configure --prefix=/home/jluitjens/libs/openmpi/
>
>
>
> Currently I’m trying to do this with just a single MPI process but
> multiple MPI processes fail in the same way:
>
>
>
> mpirun  --oversubscribe -np 1 ./command
>
>
>
> What is odd is the crash occurs around the same spot in the code but not
> consistently at the same spot.  The spot in the code where the single
> thread is at the time of the crash is nowhere near MPI code.  The code
> where it is crashing is just using malloc to allocate some memory. This
> makes me think the crash is due to a thread outside of the application I’m
> working on (perhaps in OpenMPI itself) or perhaps due to openmpi hijacking
> malloc/free.
>
>
>
> Does anyone have any ideas of what I could try to work around this issue?
>
>
>
> Thanks,
>
> Justin
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI + system() call + Matlab MEX crashes

2016-10-05 Thread Dmitry N. Mikushin
Hi Juraj,

Although the MPI infrastructure may technically support forking, it is known
that not all system resources can correctly replicate themselves into the
forked process. For example, forking inside an MPI program with an active
CUDA driver will result in a crash.

Why not compile the MATLAB code down to a native library and link it with
the MPI application directly? E.g. like here:
https://www.mathworks.com/matlabcentral/answers/98867-how-do-i-create-a-c-shared-library-from-mex-files-using-the-matlab-compiler?requestedDomain=www.mathworks.com

Kind regards,
- Dmitry Mikushin.


2016-10-05 11:32 GMT+03:00 juraj2...@gmail.com :

> Hello,
>
> I have an application in C++(main.cpp) that is launched with multiple
> processes via mpirun. Master process calls matlab via system('matlab
> -nosplash -nodisplay -nojvm -nodesktop -r "interface"'), which executes
> simple script interface.m that calls mexFunction (mexsolve.cpp) from which
> I try to set up communication with the rest of the processes launched at
> the beginning together with the master process. When I run the application
> as listed below on two different machines I experience:
>
> 1) crash at MPI_Init() in the mexFunction() on cluster machine with
> Linux 4.4.0-22-generic
>
> 2) error in MPI_Send() shown below on local machine with
> Linux 3.10.0-229.el7.x86_64
> [archimedes:31962] shmem: mmap: an error occurred while determining
> whether or not 
> /tmp/openmpi-sessions-1007@archimedes_0/58444/1/shared_mem_pool.archimedes
> could be created.
> [archimedes:31962] create_and_attach: unable to create shared memory BTL
> coordinating structure :: size 134217728
> [archimedes:31962] shmem: mmap: an error occurred while determining
> whether or not /tmp/openmpi-sessions-1007@arc
> himedes_0/58444/1/0/vader_segment.archimedes.0 could be created.
> [archimedes][[58444,1],0][../../../../../opal/mca/btl/tcp/bt
> l_tcp_endpoint.c:800:mca_btl_tcp_endpoint_complete_connect] connect() to
>  failed: Connection refused (111)
>
> I launch application as following:
> mpirun --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 1  -np
> 2 -npernode 1 ./main
>
> I have openmpi-2.0.1 configured with --prefix=${INSTALLDIR}
> --enable-mpi-fortran=all --with-pmi --disable-dlopen
>
> For more details, the code is here: https://github.com/goghino/matlabMpiC
>
> Thanks for any suggestions!
>
> Juraj
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_File_write hangs on NFS-mounted filesystem

2013-11-07 Thread Dmitry N. Mikushin
Not sure if this is related, but:

I've seen a case of performance degradation on NFS and Lustre when
writing NetCDF files. The reason was that the file was filled by a
loop writing one 4-byte record at a time. Performance became close to
that of a local hard drive when I simply introduced buffering of
records and wrote them to the file one row at a time.
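
In MPI-IO terms, the difference is between the two patterns in the sketch
below (an illustration of the buffering idea only, not the NetCDF code in
question; the record and row sizes are made up):

#include <mpi.h>

#define RECORD_SIZE      4      /* bytes per record, as in the case above */
#define RECORDS_PER_ROW  1024   /* made-up buffering granularity */

/* Writes one row of records with a single call instead of one call per
 * 4-byte record; 'row' holds RECORDS_PER_ROW * RECORD_SIZE bytes. */
static void write_row(MPI_File fh, char *row)
{
    /* Slow pattern: one tiny write per record.
     *
     * for (int i = 0; i < RECORDS_PER_ROW; i++)
     *     MPI_File_write(fh, row + i * RECORD_SIZE, RECORD_SIZE,
     *                    MPI_BYTE, MPI_STATUS_IGNORE);
     */

    /* Buffered pattern: accumulate a whole row, then write it at once. */
    MPI_File_write(fh, row, RECORDS_PER_ROW * RECORD_SIZE,
                   MPI_BYTE, MPI_STATUS_IGNORE);
}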

- D.


2013/11/7 Steven G Johnson :
> The simple C program attached below hangs on MPI_File_write when I am using 
> an NFS-mounted filesystem.   Is MPI-IO supported in OpenMPI for NFS 
> filesystems?
>
> I'm using OpenMPI 1.4.5 on Debian stable (wheezy), 64-bit Opteron CPU, Linux 
> 3.2.51.   I was surprised by this because the problems only started occurring 
> recently when I upgraded my Debian system to wheezy; with OpenMPI in the 
> previous Debian release, output to NFS-mounted filesystems worked fine.
>
> Is there any easy way to get this working?  Any tips are appreciated.
>
> Regards,
> Steven G. Johnson
>
> ---
> #include <stdio.h>
> #include <string.h>
> #include <mpi.h>
>
> void perr(const char *label, int err)
> {
> char s[MPI_MAX_ERROR_STRING];
> int len;
> MPI_Error_string(err, s, &len);
> printf("%s: %d = %s\n", label, err, s);
> }
>
> int main(int argc, char **argv)
> {
> MPI_Init(&argc, &argv);
>
> MPI_File fh;
> int err;
> err = MPI_File_open(MPI_COMM_WORLD, "tstmpiio.dat", MPI_MODE_CREATE | 
> MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
> perr("open", err);
>
> const char s[] = "Hello world!\n";
> MPI_Status status;
> err = MPI_File_write(fh, (void*) s, strlen(s), MPI_CHAR, &status);
> perr("write", err);
>
> err = MPI_File_close(&fh);
> perr("close", err);
>
> MPI_Finalize();
> return 0;
> }
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] Stream interactions in CUDA

2012-12-12 Thread Dmitry N. Mikushin
Hi Justin,

Quick grepping reveals several cuMemcpy calls in OpenMPI. Some of them are
even synchronous, which means they go to stream 0.

I think the best way of exploring this sort of behavior is to run the
OpenMPI runtime (thanks to its open-source nature!) under a debugger.
Rebuild OpenMPI with -g -O0 and add an initial sleep() to your app, long
enough to let you gdb-attach to one of the MPI processes. Once attached,
first set a breakpoint at the beginning of your region of interest and then
break on cuMemcpy and cuMemcpyAsync.
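
A minimal sketch of the "initial sleep" part (the variable name and message
are arbitrary, just one common way to do it): each rank prints its PID and
spins until you attach with gdb and release it:

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    volatile int holding = 1;
    printf("pid %d waiting for gdb attach...\n", (int)getpid());
    fflush(stdout);
    while (holding)
        sleep(1);   /* attach with gdb -p <pid>, then "set var holding = 0" */

    /* ... region of interest: break on cuMemcpy / cuMemcpyAsync here ... */

    MPI_Finalize();
    return 0;
}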

Best,
- D.

2012/12/13 Justin Luitjens 

> Hello,
>
> I'm working on an application using OpenMPI with CUDA and GPUDirect.  I
> would like to get the MPI transfers to overlap with computation on the CUDA
> device.  To do this I need to ensure that all memory transfers do not go to
> stream 0.  In this application I have one step that performs an
> MPI_Alltoall operation.  Ideally I would like this Alltoall operation to be
> asynchronous.  Thus I have implemented my own Alltoall using Isend and
> Irecv.  Which can be found at the bottom of this email.
>
> The profiler shows that this operation has some very odd PCI-E traffic
> that I was hoping someone could explain and help me eliminate.  In this
> example NPES=2 and each process has its own M2090 GPU.  I am using cuda 5.0
> and OpenMPI-1.7rc5.  The behavior I am seeing is the following.  Once the
> Isend loop occurs there is a sequence of DtoH followed by HtoD transfers.
>  These transfers are 256K in size and there are 28 of them that occur.
>  Each of these transfers are placed in stream0.  After this there are a few
> more small transfers also placed in stream0.  Finally when the 3rd loop
> occurs there are 2 DtoD transfers (this is the actual data being exchanged).
>
> Can anyone explain what all of the traffic ping-ponging back and forth
> between the host and device is?  Is this traffic necessary?
>
> Thanks,
> Justin
>
>
> uint64_t scatter_gather( uint128 * input_buffer, uint128 *output_buffer,
> uint128 *recv_buckets, int* send_sizes, int MAX_RECV_SIZE_PER_PE) {
>
>   std::vector<MPI_Request> srequest(NPES), rrequest(NPES);
>
>   //Start receives
>   for(int p=0;p<NPES;p++) {
> MPI_Irecv(recv_buckets+MAX_RECV_SIZE_PER_PE*p,MAX_RECV_SIZE_PER_PE,MPI_INT_128,p,0,MPI_COMM_WORLD,&rrequest[p]);
>   }
>
>   //Start sends
>   int send_count=0;
>   for(int p=0;p<NPES;p++) {
> MPI_Isend(input_buffer+send_count,send_sizes[p],MPI_INT_128,p,0,MPI_COMM_WORLD,&srequest[p]);
> send_count+=send_sizes[p];
>   }
>
>   //Process outstanding receives
>   int recv_count=0;
>   for(int p=0;p<NPES;p++) {
> MPI_Status status;
> MPI_Wait(&rrequest[p],&status);
> int count;
> MPI_Get_count(&status,MPI_INT_128,&count);
> assert(count<=MAX_RECV_SIZE_PER_PE);
> cudaMemcpy(output_buffer+recv_count,recv_buckets+MAX_RECV_SIZE_PER_PE*p,count*sizeof(uint128),cudaMemcpyDeviceToDevice);
> recv_count+=count;
>   }
>
>   //Wait for outstanding sends
>   for(int p=0;p<NPES;p++) {
> MPI_Status status;
> MPI_Wait(&srequest[p],&status);
>   }
>   return recv_count;
> }
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] fork in Fortran

2012-08-30 Thread Dmitry N. Mikushin
Hi,

Modern Fortran has a feature called ISO_C_BINDING. It essentially
allows you to declare a binding to an external C function so that it can
be used from a Fortran program. You only need to provide a corresponding
interface. The ISO_C_BINDING module contains C-like extensions to the
type system, but you don't need them, as your function has no arguments :)

Example:

program fork_test

interface

function fork() bind(C)
use iso_c_binding
integer(c_int) :: fork
end function fork

end interface

print *, 'My PID = ', fork()

end program fork_test

$ make
gfortran fork.f90 -o fork
$ ./fork
 My PID = 4033
 My PID =0

For further info, please refer to the language standard:
http://gcc.gnu.org/wiki/GFortranStandards#Fortran_2003
If you have any questions, consider asking the gfortran community;
they are very friendly.

Best,
- D.

2012/8/30 sudhirs@ :
> Dear users,
>  How to use fork(), vfork() type functions in Fortran programming ??
>
> Thanking you in advance
>
> --
> Sudhir Kumar Sahoo
> Ph.D Scholar
> Dept. Of Chemistry
> IIT Kanpur-208016
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] bug in CUDA support for dual-processor systems?

2012-08-02 Thread Dmitry N. Mikushin
Hi Zbigniew,

> a) I noticed that on my 6-GPU 2-CPU  platform the initialization of CUDA 4.2 
> takes a long time, approx 10 seconds.
> Do you think I should report this as a bug to nVidia?

This is an expected time for the creation of driver contexts on so many
devices. I'm sure NVIDIA has already received thousands of reports on this :)
The typical answer is: keep a context alive on the GPU, either by running an
X server or by executing "nvidia-smi -l 1" in the background. With one of
these, the init time should drop down to ~1 sec or less.

- D.

2012/7/31 Zbigniew Koza :
> Thanks for a quick reply.
>
> I do not know much about low-level CUDA and IPC,
> but there's no problem using high-level CUDA to determine if
> device A can talk to B via GPUDirect (cudaDeviceCanAccessPeer).
> Then, for such connections, one only needs to call
> cudaDeviceEnablePeerAccess
> and then essentially  "sit back and laugh" -  given correct current device
> and stream, functions like cudaMemcpyPeer work irrespectively of whether
> GPUDirect
> is on or off for a given pair of devices, the only difference being the
> speed.
> So, I hope it should be possible to implement device-IOH-IOH-device
> communication using low-level CUDA.
> Such functionality should be an important step in the "CPU-GPU
> high-performance war" :-),
> as  8-GPU fast-MPI-link systems  bring a new meaning to a "GPU node" in GPU
> clusters...
>
> Here is the output of my test program that was aimed at determining
> a) aggregate, best-case transfer rate between 6 GPUs running in parallel and
> b) whether devices on different IOHs can talk to each other:
>
> 3 [GB] in  78.6952 [ms] =  38.1218 GB/s (aggregate)
> sending 6 bytes from device 0:
> 0 -> 0: 11.3454 [ms] 52.8848 GB/s
> 0 -> 1: 90.3628 [ms] 6.6399 GB/s
> 0 -> 2: 113.396 [ms] 5.29117 GB/s
> 0 -> 3: 113.415 [ms] 5.29032 GB/s
> 0 -> 4: 170.307 [ms] 3.52305 GB/s
> 0 -> 5: 169.613 [ms] 3.53747 GB/s
>
> This shows that even if devices are on different IOHs, like 0 and 4, they
> can talk to each other at a fantastic speed of 3.5 GB/s
> and it would be pity if OpenMPI did not used this opportunity.
>
> I have also 2 questions:
>
> a) I noticed that on my 6-GPU 2-CPU  platform the initialization of CUDA 4.2
> takes a long time, approx 10 seconds.
> Do you think I should report this as a bug to nVidia?
>
> b) Is there any info on running OpenMPI + CUDA? For example, what are the
> dependencies of transfer rates and latencies on transfer size?
> A dedicated www page, blog or whatever? How can I know if the current
> problem was solved?
>
>
>
> Many thanks for making CUDA available in OpenMPI.
>
> Regards
>
> Z Koza
>
> W dniu 31.07.2012 19:39, Rolf vandeVaart pisze:
>
>> The current implementation does assume that the GPUs are on the same IOH
>> and therefore can use the IPC features of the CUDA library for
>> communication.
>> One of the initial motivations for this was that to be able to detect
>> whether GPUs can talk to one another, the CUDA library has to be initialized
>> and the GPUs have to be selected by each rank.  It is at that point that we
>> can determine whether the IPC will work between the GPUs.However, this
>> means that the GPUs need to be selected by each rank prior to the call to
>> MPI_Init as that is where we determine whether IPC is possible, and we were
>> trying to avoid that requirement.
>>
>> I will submit a ticket against this and see if we can improve this.
>>
>> Rolf
>>
>>> -Original Message-
>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
>>> On Behalf Of Zbigniew Koza
>>> Sent: Tuesday, July 31, 2012 12:38 PM
>>> To: us...@open-mpi.org
>>> Subject: [OMPI users] bug in CUDA support for dual-processor systems?
>>>
>>> Hi,
>>>
>>> I wrote a simple program to see if OpenMPI can really handle cuda
>>> pointers as
>>> promised in the FAQ and how efficiently.
>>> The program (see below) breaks if MPI communication is to be performed
>>> between two devices that are on the same node but under different IOHs in
>>> a
>>> dual-processor Intel machine.
>>> Note that  cudaMemCpy works for such devices, although not as efficiently
>>> as
>>> for the devices on the same IOH and GPUDirect enabled.
>>>
>>> Here's the output from my program:
>>>
>>> ===
>>>
   mpirun -n 6 ./a.out
>>>
>>> Init
>>> Init
>>> Init
>>> Init
>>> Init
>>> Init
>>> rank: 1, size: 6
>>> rank: 2, size: 6
>>> rank: 3, size: 6
>>> rank: 4, size: 6
>>> rank: 5, size: 6
>>> rank: 0, size: 6
>>> device 3 is set
>>> Process 3 is on typhoon1
>>> Using regular memory
>>> device 0 is set
>>> Process 0 is on typhoon1
>>> Using regular memory
>>> device 4 is set
>>> Process 4 is on typhoon1
>>> Using regular memory
>>> device 1 is set
>>> Process 1 is on typhoon1
>>> Using regular memory
>>> device 5 is set
>>> Process 5 is on typhoon1
>>> Using regular memory
>>> device 2 is set
>>> Process 2 is on typhoon1
>>> Using regular memory
>>> ^C^[[A^C
>>> zkoza@typhoon1:~/multigpu$
>

Re: [OMPI users] undefined reference to `netcdf_mp_nf90_open_'

2012-06-26 Thread Dmitry N. Mikushin
Dear Syed,

Why do you think it is related to MPI?

You seem to be compiling the COSMO model, which depends on the netcdf
library, but the symbols are not passed to the linker for some reason. The
two main possible reasons are: (1) a library linking flag is missing (check
that you have something like -lnetcdf -lnetcdff on your linker command
line); (2) the netcdf Fortran bindings were compiled with a different
naming convention (check that the names in the library really contain the
expected number of final underscores).
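
For example, something like "nm /path/to/libnetcdff.a | grep -i nf90_open"
(adjust the path and library name to your installation) shows how nf90_open
is actually mangled in your netcdf build, which you can then compare against
the undefined references below.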

I compiled cosmo 4.22 with openmpi and netcdf not long ago without any
problems.

Best,
- Dima.

2012/6/26 Syed Ahsan Ali 

> Dear All
>
> I am getting following error while compilation of an application. Seems
> like something related to netcdf and mpif90. Although I have compiled
> netcdf with mpif90 option, dont why this error is happening. Any hint would
> be highly appreciated.
>
>
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o: In
> function `src_obs_proc_cdf_mp_obs_cdf_read_org_':
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x17aa):
> undefined reference to `netcdf_mp_nf90_open_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o: In
> function `src_obs_proc_cdf_mp_obs_cdf_read_temp_pilot_':
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x1000e):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10039):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10064):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x1008b):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x100c8):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10227):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x102eb):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x103af):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10473):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10559):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10890):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x108bb):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x108e2):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10909):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10930):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o:/home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x109e8):
> more undefined references to `netcdf_mp_nf90_inq_varid_' follow
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o: In
> function `src_obs_proc_cdf_mp_obs_cdf_read_temp_pilot_':
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10abc):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10b8c):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10c5c):
> undefined reference to `netcdf_mp_nf90_get_var_1d_eightbytereal_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10d2c):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10dfc):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10ecc):
> undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10ef3):
> undefined reference to `netcdf_mp_nf90_inq_varid_'
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src

Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-19 Thread Dmitry N. Mikushin
Dear Rolf,

I compiled openmpi-trunk with $ ../configure --prefix=/opt/openmpi-trunk
--disable-mpi-interface-warning --with-cuda=/opt/cuda
And that error is now gone!

Thanks a lot for your assistance,
- D.

2012/6/19 Rolf vandeVaart 

> Dmitry:
>
>
> It turns out that by default in Open MPI 1.7, configure enables warnings
> for deprecated MPI functionality.  In Open MPI 1.6, these warnings were
> disabled by default.
>
> That explains why you would not see this issue in the earlier versions of
> Open MPI.
>
>
> I assume that gcc must have added support for
> __attribute__((__deprecated__)) and then later on
> __attribute__((__deprecated__(msg))) and your version of gcc supports both
> of these.  (My version of gcc, 4.5.1 does not support the msg in the
> attribute)
>
>
> The version of nvcc you have does not support the "msg" argument so
> everything blows up.
>
>
> I suggest you configure with --disable-mpi-interface-warning which will
> prevent any of the deprecated attributes from being used and then things
> should work fine.
>
>
> Let me know if this fixes your problem.
>
>
> Rolf
>
>
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Rolf vandeVaart
> Sent: Monday, June 18, 2012 11:00 AM
>
> To: Open MPI Users
> Cc: Олег Рябков
> Subject: Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__"
> does not take arguments
>
>
> Hi Dmitry:
>
> Let me look into this.
>
>
> Rolf
>
>
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Dmitry N. Mikushin
> Sent: Monday, June 18, 2012 10:56 AM
> To: Open MPI Users
> Cc: Олег Рябков
> Subject: Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__"
> does not take arguments
>
>
> Yeah, definitely. Thank you, Jeff.
>
> - D.
>
> 2012/6/18 Jeff Squyres 
>
> On Jun 18, 2012, at 10:41 AM, Dmitry N. Mikushin wrote:
>
> > No, I'm configuring with gcc, and for openmpi-1.6 it works with nvcc
> without a problem.
>
> Then I think Rolf (from Nvidia) should figure this out; I don't have
> access to nvcc.  :-)
>
>
> > Actually, nvcc always meant to be more or less compatible with gcc, as
> far as I know. I'm guessing in case of trunk nvcc is the source of the
> issue.
> >
> > And with ./configure CC=nvcc etc. it won't build:
> >
> /home/dmikushin/forge/openmpi-trunk/opal/mca/event/libevent2019/libevent/include/event2/util.h:126:2:
> error: #error "No way to define ev_uint64_t"
>
> You should complain to Nvidia about that.
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-18 Thread Dmitry N. Mikushin
Yeah, definitely. Thank you, Jeff.

- D.

2012/6/18 Jeff Squyres 

> On Jun 18, 2012, at 10:41 AM, Dmitry N. Mikushin wrote:
>
> > No, I'm configuring with gcc, and for openmpi-1.6 it works with nvcc
> without a problem.
>
> Then I think Rolf (from Nvidia) should figure this out; I don't have
> access to nvcc.  :-)
>
> > Actually, nvcc always meant to be more or less compatible with gcc, as
> far as I know. I'm guessing in case of trunk nvcc is the source of the
> issue.
> >
> > And with ./configure CC=nvcc etc. it won't build:
> >
> /home/dmikushin/forge/openmpi-trunk/opal/mca/event/libevent2019/libevent/include/event2/util.h:126:2:
> error: #error "No way to define ev_uint64_t"
>
> You should complain to Nvidia about that.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-18 Thread Dmitry N. Mikushin
No, I'm configuring with gcc, and for openmpi-1.6 it works with nvcc
without a problem.
Actually, nvcc has always been meant to be more or less compatible with gcc,
as far as I know. I'm guessing that in the case of trunk, nvcc is the source
of the issue.
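
For reference, the construct in question boils down to the optional message
argument of the attribute; a minimal reproduction (not taken from mpi.h,
just an illustration) is:

/* Accepted by both gcc and this nvcc front end: */
int old_api(void) __attribute__((__deprecated__));

/* The message form, which newer gcc accepts but this nvcc rejects with
 * "attribute __deprecated__ does not take arguments": */
int newer_api(void) __attribute__((__deprecated__("use something else")));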

And with ./configure CC=nvcc etc. it won't build:
/home/dmikushin/forge/openmpi-trunk/opal/mca/event/libevent2019/libevent/include/event2/util.h:126:2:
error: #error "No way to define ev_uint64_t"

Thanks,
- D.

2012/6/18 Jeff Squyres 

> Did you configure and build Open MPI with nvcc?
>
> I ask because Open MPI should auto-detect whether the underlying compiler
> can handle a message argument with the deprecated directive or not.
>
> You should be able to build Open MPI with:
>
>./configure CC=nvcc etc.
>make clean all install
>
> If you're building Open MPI with one compiler and then trying to compile
> with another (like the command line in your mail implies), all bets are off
> because Open MPI has tuned itself to the compiler that it was configured
> with.
>
>
>
>
> On Jun 18, 2012, at 10:20 AM, Dmitry N. Mikushin wrote:
>
> > Hello,
> >
> > With openmpi svn trunk as of
> >
> > Repository Root: http://svn.open-mpi.org/svn/ompi
> > Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
> > Revision: 26616
> >
> > we are observing the following strange issue (see below). How do you
> think, is it a problem of NVCC or OpenMPI?
> >
> > Thanks,
> > - Dima.
> >
> > [dmikushin@tesla-apc mpitest]$ cat mpitest.cu
> > #include 
> >
> > __global__ void kernel() { }
> >
> > [dmikushin@tesla-apc mpitest]$ nvcc -I/opt/openmpi-trunk/include -c
> mpitest.cu
> > /opt/openmpi-trunk/include/mpi.h(365): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(374): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(382): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(724): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(730): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(736): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(790): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(791): error: attribute "__deprecated__"
> does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1049): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1070): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1072): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1074): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1145): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1149): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1151): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1345): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1347): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1484): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1507): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1510): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1515): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1525): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1527): error: attribute
> "__deprecated__" does not take arguments
> >
> > /opt/openmpi-trunk/include/mpi.h(1589): error: attribute
> "__deprecated__" does not take ar

[OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-18 Thread Dmitry N. Mikushin
Hello,

With openmpi svn trunk as of

Repository Root: http://svn.open-mpi.org/svn/ompi
Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
Revision: 26616

we are observing the following strange issue (see below). What do you think:
is it a problem with NVCC or with OpenMPI?

Thanks,
- Dima.

[dmikushin@tesla-apc mpitest]$ cat mpitest.cu
#include 

__global__ void kernel() { }

[dmikushin@tesla-apc mpitest]$ nvcc -I/opt/openmpi-trunk/include -c
mpitest.cu
/opt/openmpi-trunk/include/mpi.h(365): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(374): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(382): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(724): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(730): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(736): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(790): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(791): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1049): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1070): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1072): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1074): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1145): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1149): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1151): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1345): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1347): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1484): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1507): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1510): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1515): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1525): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1527): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1589): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1610): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1612): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1614): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1685): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1689): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1691): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1886): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(1888): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(2024): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(2047): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(2050): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(2055): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(2065): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/mpi.h(2067): error: attribute "__deprecated__"
does not take arguments

/opt/openmpi-trunk/include/openmpi/ompi/mpi/cxx/comm.h(102): error:
attribute "__deprecated__" does not take arguments

/opt/openmpi-trunk/include/openmpi/ompi/mpi/cxx/win.h(90): error: attribute
"__deprecated__" does not take arguments

/opt/openmpi-trunk/include/openmpi/ompi/mpi/cxx/file.h(298): error:
attribute "__deprecated__" does not take arguments

41 errors detected in the compilation of
"/tmp/tmpxft_4a17_-4_mpitest.cpp1.ii".


Re: [OMPI users] starting open-mpi

2012-05-11 Thread Dmitry N. Mikushin
Hi Ghobad,

The error message means that OpenMPI wants to use cl.exe, the compiler
from Microsoft Visual Studio.

Here, at http://www.open-mpi.org/software/ompi/v1.5/ms-windows.php, it is
stated:

This is the first binary release for Windows, with basic MPI libraries
and executables. The supported platforms are Windows XP, Windows
Vista, Windows Server 2003/2008, and Windows 7 (including both 32 and
64 bit versions). The installers were configured with CMake 2.8.1 and
compiled under Visual Studio 2010, and they support for C/C++
compilers of Visual Studio 2005, 2008 and 2010.

So, to compile MPI programs you probably need one of these compilers to
be installed.

Best regards.
- Dima.

2012/5/10 Ghobad Zarrinchian :
> Hi all. I'm a new open-mpi user. I've downloaded the
> OpenMPI_v1.5.5-1_win32.exe file to install open-mpi on my dual-core windows
> 7 machine. I installed the file but now i can't compile my mpi programs. I
> use command below (in command prompt window) to compile my 'test.cpp'
> program:
>
>>> mpic++ -o test test.cpp
>
> but i get error as follows:
>
>>> The open mpi wrapper compiler was unable to find the specified compiler
>>> cl.exe in your path.
>  Note that this compiler was either specified at configure time or in
> one several possible environment variables.
>
> What is the problem? Is my compilation command right? Is there any remained
> necessary steps to complete my open-mpi installation?
> Is it necessary to specify some environment variables?
>
> Thanks in advanced.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] possibly undefined macro: AC_PROG_LIBTOOL

2011-12-29 Thread Dmitry N. Mikushin
OK, apparently those were various backports. Now I have only one
autoconf installed, and it is:

marcusmae@teslatron:~/Programming/openmpi-r24785$ autoconf --version
autoconf (GNU Autoconf) 2.67

marcusmae@teslatron:~/Programming/openmpi-r24785$ autoreconf --version
autoreconf (GNU Autoconf) 2.67

However:

6. Processing autogen.subdirs directories

=== Processing subdir:
/home/marcusmae/Programming/openmpi-r24785/opal/mca/event/libevent207/libevent
--- Found autogen.sh; running...
autoreconf: Entering directory `.'
autoreconf: configure.in: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.in: tracing
autoreconf: configure.in: not using Libtool
autoreconf: running: /usr/bin/autoconf --force
configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL
  If this token and others are legitimate, please use m4_pattern_allow.
  See the Autoconf documentation.
autoreconf: /usr/bin/autoconf failed with exit status: 1
Command failed: ./autogen.sh

Does it work for you with 2.67?

Thanks,
- D.

2011/12/30 Ralph Castain :
>
> On Dec 29, 2011, at 3:39 PM, Dmitry N. Mikushin wrote:
>
>> No, that was autoREconf, and all they are below 2.65:
>>
>> marcusmae@teslatron:~/Programming/openmpi-r24785$ ls /usr/bin/autoreconf
>> autoreconf      autoreconf2.13  autoreconf2.50  autoreconf2.59  
>> autoreconf2.64
>>
>> And default one points to 2.50:
>>
>> marcusmae@teslatron:~/Programming/openmpi-r24785$ autoreconf -help
>> Usage: /usr/bin/autoreconf2.50 [OPTION]... [DIRECTORY]...
>>
>> I don't know why, probably that's the default Debian Squeeze setup?
>
> Probably - but that's no good. It should be the same level as autoconf as the 
> two are packaged together to avoid incompatibilities like you are hitting 
> here. Did you install autoconf yourself? If so, can you point autoreconf to 
> the corresponding binary?
>
>>
>> - D.
>>
>> 2011/12/30 Ralph Castain :
>>> Strange - if you look at your original output, autoconf is identified as 
>>> 2.50 - a version that is way too old for us. However, what you just sent 
>>> now shows 2.67, which would be fine.
>>>
>>> Why the difference?
>>>
>>>
>>> On Dec 29, 2011, at 3:27 PM, Dmitry N. Mikushin wrote:
>>>
>>>> Hi Ralph,
>>>>
>>>> URL: http://svn.open-mpi.org/svn/ompi/trunk
>>>> Repository Root: http://svn.open-mpi.org/svn/ompi
>>>> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
>>>> Revision: 24785
>>>> Node Kind: directory
>>>> Schedule: normal
>>>> Last Changed Author: rhc
>>>> Last Changed Rev: 24785
>>>> Last Changed Date: 2011-06-17 22:01:23 +0400 (Fri, 17 Jun 2011)
>>>>
>>>> 1. Checking tool versions
>>>>
>>>>   Searching for autoconf
>>>>     Found autoconf version 2.67; checking version...
>>>>       Found version component 2 -- need 2
>>>>       Found version component 67 -- need 65
>>>>     ==> ACCEPTED
>>>>   Searching for libtoolize
>>>>     Found libtoolize version 2.2.6b; checking version...
>>>>       Found version component 2 -- need 2
>>>>       Found version component 2 -- need 2
>>>>       Found version component 6b -- need 6b
>>>>     ==> ACCEPTED
>>>>   Searching for automake
>>>>     Found automake version 1.11.1; checking version...
>>>>       Found version component 1 -- need 1
>>>>       Found version component 11 -- need 11
>>>>       Found version component 1 -- need 1
>>>>     ==> ACCEPTED
>>>>
>>>> 2011/12/30 Ralph Castain :
>>>>> Are you doing this on a subversion checkout? Of which branch?
>>>>>
>>>>> Did you check your autotoll versions to ensure you meet the minimum 
>>>>> required levels? The requirements differ by version.
>>>>>
>>>>> On Dec 29, 2011, at 2:52 PM, Dmitry N. Mikushin wrote:
>>>>>
>>>>>> Dear Open MPI Community,
>>>>>>
>>>>>> I need a custom OpenMPI build. While running ./autogen.pl on Debian
>>>>>> Squeeze, there is an error:
>>>>>>
>>>>>> --- Found autogen.sh; running...
>>>>>> autoreconf2.50: Entering directory `.'
>>>>>> autoreconf2.50: configure.in: not using Gettext
>>>>>> autoreconf2.50: running: aclocal --force -I m4
>>>>>> autoreconf2.50: configure.in: tracing

Re: [OMPI users] possibly undefined macro: AC_PROG_LIBTOOL

2011-12-29 Thread Dmitry N. Mikushin
No, that was autoREconf, and they are all below 2.65:

marcusmae@teslatron:~/Programming/openmpi-r24785$ ls /usr/bin/autoreconf
autoreconf  autoreconf2.13  autoreconf2.50  autoreconf2.59  autoreconf2.64

And default one points to 2.50:

marcusmae@teslatron:~/Programming/openmpi-r24785$ autoreconf -help
Usage: /usr/bin/autoreconf2.50 [OPTION]... [DIRECTORY]...

I don't know why, probably that's the default Debian Squeeze setup?

- D.

2011/12/30 Ralph Castain :
> Strange - if you look at your original output, autoconf is identified as 2.50 
> - a version that is way too old for us. However, what you just sent now shows 
> 2.67, which would be fine.
>
> Why the difference?
>
>
> On Dec 29, 2011, at 3:27 PM, Dmitry N. Mikushin wrote:
>
>> Hi Ralph,
>>
>> URL: http://svn.open-mpi.org/svn/ompi/trunk
>> Repository Root: http://svn.open-mpi.org/svn/ompi
>> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
>> Revision: 24785
>> Node Kind: directory
>> Schedule: normal
>> Last Changed Author: rhc
>> Last Changed Rev: 24785
>> Last Changed Date: 2011-06-17 22:01:23 +0400 (Fri, 17 Jun 2011)
>>
>> 1. Checking tool versions
>>
>>   Searching for autoconf
>>     Found autoconf version 2.67; checking version...
>>       Found version component 2 -- need 2
>>       Found version component 67 -- need 65
>>     ==> ACCEPTED
>>   Searching for libtoolize
>>     Found libtoolize version 2.2.6b; checking version...
>>       Found version component 2 -- need 2
>>       Found version component 2 -- need 2
>>       Found version component 6b -- need 6b
>>     ==> ACCEPTED
>>   Searching for automake
>>     Found automake version 1.11.1; checking version...
>>       Found version component 1 -- need 1
>>       Found version component 11 -- need 11
>>       Found version component 1 -- need 1
>>     ==> ACCEPTED
>>
>> 2011/12/30 Ralph Castain :
>>> Are you doing this on a subversion checkout? Of which branch?
>>>
>>> Did you check your autotoll versions to ensure you meet the minimum 
>>> required levels? The requirements differ by version.
>>>
>>> On Dec 29, 2011, at 2:52 PM, Dmitry N. Mikushin wrote:
>>>
>>>> Dear Open MPI Community,
>>>>
>>>> I need a custom OpenMPI build. While running ./autogen.pl on Debian
>>>> Squeeze, there is an error:
>>>>
>>>> --- Found autogen.sh; running...
>>>> autoreconf2.50: Entering directory `.'
>>>> autoreconf2.50: configure.in: not using Gettext
>>>> autoreconf2.50: running: aclocal --force -I m4
>>>> autoreconf2.50: configure.in: tracing
>>>> autoreconf2.50: configure.in: not using Libtool
>>>> autoreconf2.50: running: /usr/bin/autoconf --force
>>>> configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL
>>>>      If this token and others are legitimate, please use m4_pattern_allow.
>>>>      See the Autoconf documentation.
>>>> autoreconf2.50: /usr/bin/autoconf failed with exit status: 1
>>>> Command failed: ./autogen.sh
>>>>
>>>> It's a bit confusing, because automake, libtool, autoconf are
>>>> installed. What might be the other reasons of this error?
>>>>
>>>> Thanks,
>>>> - Dima.
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] possibly undefined macro: AC_PROG_LIBTOOL

2011-12-29 Thread Dmitry N. Mikushin
Hi Ralph,

URL: http://svn.open-mpi.org/svn/ompi/trunk
Repository Root: http://svn.open-mpi.org/svn/ompi
Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
Revision: 24785
Node Kind: directory
Schedule: normal
Last Changed Author: rhc
Last Changed Rev: 24785
Last Changed Date: 2011-06-17 22:01:23 +0400 (Fri, 17 Jun 2011)

1. Checking tool versions

   Searching for autoconf
 Found autoconf version 2.67; checking version...
   Found version component 2 -- need 2
   Found version component 67 -- need 65
 ==> ACCEPTED
   Searching for libtoolize
 Found libtoolize version 2.2.6b; checking version...
   Found version component 2 -- need 2
   Found version component 2 -- need 2
   Found version component 6b -- need 6b
 ==> ACCEPTED
   Searching for automake
 Found automake version 1.11.1; checking version...
   Found version component 1 -- need 1
   Found version component 11 -- need 11
   Found version component 1 -- need 1
 ==> ACCEPTED

2011/12/30 Ralph Castain :
> Are you doing this on a subversion checkout? Of which branch?
>
> Did you check your autotoll versions to ensure you meet the minimum required 
> levels? The requirements differ by version.
>
> On Dec 29, 2011, at 2:52 PM, Dmitry N. Mikushin wrote:
>
>> Dear Open MPI Community,
>>
>> I need a custom OpenMPI build. While running ./autogen.pl on Debian
>> Squeeze, there is an error:
>>
>> --- Found autogen.sh; running...
>> autoreconf2.50: Entering directory `.'
>> autoreconf2.50: configure.in: not using Gettext
>> autoreconf2.50: running: aclocal --force -I m4
>> autoreconf2.50: configure.in: tracing
>> autoreconf2.50: configure.in: not using Libtool
>> autoreconf2.50: running: /usr/bin/autoconf --force
>> configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL
>>      If this token and others are legitimate, please use m4_pattern_allow.
>>      See the Autoconf documentation.
>> autoreconf2.50: /usr/bin/autoconf failed with exit status: 1
>> Command failed: ./autogen.sh
>>
>> It's a bit confusing, because automake, libtool, autoconf are
>> installed. What might be the other reasons of this error?
>>
>> Thanks,
>> - Dima.
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] possibly undefined macro: AC_PROG_LIBTOOL

2011-12-29 Thread Dmitry N. Mikushin
Dear Open MPI Community,

I need a custom OpenMPI build. While running ./autogen.pl on Debian
Squeeze, there is an error:

--- Found autogen.sh; running...
autoreconf2.50: Entering directory `.'
autoreconf2.50: configure.in: not using Gettext
autoreconf2.50: running: aclocal --force -I m4
autoreconf2.50: configure.in: tracing
autoreconf2.50: configure.in: not using Libtool
autoreconf2.50: running: /usr/bin/autoconf --force
configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL
  If this token and others are legitimate, please use m4_pattern_allow.
  See the Autoconf documentation.
autoreconf2.50: /usr/bin/autoconf failed with exit status: 1
Command failed: ./autogen.sh

It's a bit confusing, because automake, libtool, and autoconf are all
installed. What else might be the reason for this error?

Thanks,
- Dima.


Re: [OMPI users] How "CUDA Init prior to MPI_Init" co-exists with unique GPU for each MPI process?

2011-12-14 Thread Dmitry N. Mikushin
Dear Matthieu, Rolf,

Thank you!

But normally CUDA device selection is based on the MPI process rank, so
the CUDA context must be created at a point where the rank is not yet
available. What is the best practice for process<->GPU mapping in this
case? Or can I select any device prior to MPI_Init and later switch to
another device?

- D.
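
P.S. In case it helps someone reading the archive: a minimal sketch (in C,
untested) of what I currently have in mind. It assumes that Open MPI's mpirun
exports the OMPI_COMM_WORLD_LOCAL_RANK environment variable, so a device can
be chosen from the local rank before MPI_Init; I don't know whether this is
the officially recommended mapping.

#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    /* Pick a GPU before MPI_Init, using the launcher's environment
       instead of the (not yet available) MPI rank. */
    const char* lrank = getenv("OMPI_COMM_WORLD_LOCAL_RANK");
    int local_rank = lrank ? atoi(lrank) : 0;

    int ndevices = 0;
    cudaGetDeviceCount(&ndevices);
    if (ndevices > 0)
        cudaSetDevice(local_rank % ndevices);

    /* Contexts are created lazily; touch the device to force creation,
       so that a context already exists when MPI_Init runs. */
    cudaFree(0);

    MPI_Init(&argc, &argv);
    /* ... CUDA-aware MPI communication goes here ... */
    MPI_Finalize();
    return 0;
}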

2011/12/14 Rolf vandeVaart :
> To add to this, yes, we recommend that the CUDA context exists prior to a
> call to MPI_Init.  That is because a CUDA context needs to exist prior to
> MPI_Init as the library attempts to register some internal buffers with the
> CUDA library that require a CUDA context exists already.  Note that this is
> only relevant if you plan to send and receive CUDA device memory directly
> from MPI calls.   There is a little more about this in the FAQ here.
>
>
>
> http://www.open-mpi.org/faq/?category=running#mpi-cuda-support
>
>
>
>
>
> Rolf
>
>
>
> From: Matthieu Brucher [mailto:matthieu.bruc...@gmail.com]
> Sent: Wednesday, December 14, 2011 10:47 AM
> To: Open MPI Users
> Cc: Rolf vandeVaart
> Subject: Re: [OMPI users] How "CUDA Init prior to MPI_Init" co-exists with
> unique GPU for each MPI process?
>
>
>
> Hi,
>
>
>
> Processes are not spawned by MPI_Init. They are spawned before by some
> applications between your mpirun call and when your program starts. When it
> does, you already have all MPI processes (you can check by adding a sleep or
> something like that), but they are not synchronized and do not know each
> other. This is what MPI_Init is used for.
>
>
>
> Matthieu Brucher
>
> 2011/12/14 Dmitry N. Mikushin 
>
> Dear colleagues,
>
> For GPU Winter School powered by Moscow State University cluster
> "Lomonosov", the OpenMPI 1.7 was built to test and popularize CUDA
> capabilities of MPI. There is one strange warning I cannot understand:
> OpenMPI runtime suggests to initialize CUDA prior to MPI_Init. Sorry,
> but how could it be? I thought processes are spawned during MPI_Init,
> and such context will be created on the very first root process. Why
> do we need existing CUDA context before MPI_Init? I think there was no
> such error in previous versions.
>
> Thanks,
> - D.
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
>
>
> --
> Information System Engineer, Ph.D.
> Blog: http://matt.eifelle.com
> LinkedIn: http://www.linkedin.com/in/matthieubrucher
>
> 
> This email message is for the sole use of the intended recipient(s) and may
> contain confidential information.  Any unauthorized review, use, disclosure
> or distribution is prohibited.  If you are not the intended recipient,
> please contact the sender by reply email and destroy all copies of the
> original message.
> 
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] How "CUDA Init prior to MPI_Init" co-exists with unique GPU for each MPI process?

2011-12-14 Thread Dmitry N. Mikushin
Dear colleagues,

For the GPU Winter School powered by the Moscow State University cluster
"Lomonosov", OpenMPI 1.7 was built to test and popularize the CUDA
capabilities of MPI. There is one strange warning I cannot understand:
the OpenMPI runtime suggests initializing CUDA prior to MPI_Init. Sorry,
but how can that be? I thought processes are spawned during MPI_Init,
and such a context would be created only on the very first root process.
Why do we need an existing CUDA context before MPI_Init? I think there
was no such warning in previous versions.

Thanks,
- D.


Re: [OMPI users] configure with cuda

2011-10-27 Thread Dmitry N. Mikushin
> CUDA is an Nvidia-only technology, so it might be a bit limiting in some 
> cases.

I think here it's more a question of compatibility (that is, ~ 1.0 /
[magnitude of effort]) rather than corporate selfishness >:) Consider
the memory buffer implementation - in contrast to CUDA, in OpenCL the
buffers are abstract containers (cl_mem), not plain pointers. So, to
combine OpenCL with MPI, one would first need to propose and adopt a
suitable API design. That alone is not an easy task, IMO.

- D.
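
P.S. To make the point concrete, a rough and untested sketch of what sending
an OpenCL buffer over MPI would have to look like today; the helper name and
error codes are just for illustration:

#include <mpi.h>
#include <CL/cl.h>
#include <stdlib.h>

/* Hypothetical helper: MPI cannot take a cl_mem directly, so the buffer
   contents are copied into host memory first and sent from there. */
int send_cl_buffer(cl_command_queue queue, cl_mem buf, size_t nbytes,
                   int dest, int tag, MPI_Comm comm)
{
    void* host = malloc(nbytes);
    if (!host) return MPI_ERR_NO_MEM;

    /* Blocking read: device buffer -> host memory. */
    cl_int err = clEnqueueReadBuffer(queue, buf, CL_TRUE, 0, nbytes,
                                     host, 0, NULL, NULL);
    if (err != CL_SUCCESS) { free(host); return MPI_ERR_OTHER; }

    int rc = MPI_Send(host, (int)nbytes, MPI_BYTE, dest, tag, comm);
    free(host);
    return rc;
}

With CUDA a device buffer is a plain pointer, so (with CUDA-aware MPI) the
staging copy can go away entirely - that is the API gap I mean.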

2011/10/27 Durga Choudhury :
> Is there any provision/future plans to add OpenCL support as well?
> CUDA is an Nvidia-only technology, so it might be a bit limiting in
> some cases.
>
> Best regards
> Durga
>
>
> On Thu, Oct 27, 2011 at 2:45 PM, Rolf vandeVaart  
> wrote:
>> Actually, that is not quite right.  From the FAQ:
>>
>>
>>
>> “This feature currently only exists in the trunk version of the Open MPI
>> library.”
>>
>>
>>
>> You need to download and use the trunk version for this to work.
>>
>>
>>
>> http://www.open-mpi.org/nightly/trunk/
>>
>>
>>
>> Rolf
>>
>>
>>
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>> Behalf Of Ralph Castain
>> Sent: Thursday, October 27, 2011 11:43 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] configure with cuda
>>
>>
>>
>>
>>
>> I'm pretty sure cuda support was never moved to the 1.4 series. You will,
>> however, find it in the 1.5 series. I suggest you get the latest tarball
>> from there.
>>
>>
>>
>>
>>
>> On Oct 27, 2011, at 12:38 PM, Peter Wells wrote:
>>
>>
>>
>> I am attempting to configure OpenMPI 1.4.3 with cuda support on a Redhat 5
>> box. When I try to run configure with the following command:
>>
>>
>>
>>  ./configure
>> --prefix=/opt/crc/sandbox/pwells2/openmpi/1.4.3/intel-12.0-cuda/ FC=ifort
>> F77=ifort CXX=icpc CC=icc --with-sge --disable-dlopen --enable-static
>> --enable-shared --disable-openib-connectx-xrc --disable-openib-rdmacm
>> --without-openib --with-cuda=/opt/crc/cuda/4.0/cuda
>> --with-cuda-libdir=/opt/crc/cuda/4.0/cuda/lib64
>>
>>
>>
>> I receive the warning that '--with-cuda' and '--with-cuda-libdir' are
>> unrecognized options. According to the FAQ these options are supported in
>> this version of OpenMPI. I attempted the same thing with v.1.4.4 downloaded
>> directly from open-mpi.org with similar results. Attached are the results of
>> configure and make on v.1.4.3. Any help would be greatly appreciated.
>>
>>
>>
>> Peter Wells
>> HPC Intern
>> Center for Research Computing
>> University of Notre Dame
>> pwel...@nd.edu
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> 
>> This email message is for the sole use of the intended recipient(s) and may
>> contain confidential information.  Any unauthorized review, use, disclosure
>> or distribution is prohibited.  If you are not the intended recipient,
>> please contact the sender by reply email and destroy all copies of the
>> original message.
>> 
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] OpenMPI with CPU of different speed.

2011-10-05 Thread Dmitry N. Mikushin
Hi,

Maybe Mickaël means that load balancing could be achieved simply by
spawning a different number of MPI processes per node, depending on how
many cores each particular node has? This should be possible, but the
accuracy of such balancing will be task-dependent due to other factors,
like memory operations and communication.

- D.
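
P.S. As a rough illustration of doing the weighting inside the application
itself (an untested sketch; the micro-benchmark and item counts are made up),
each rank could time a bit of local work, share the timings with
MPI_Allgather, and take a proportional slice:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Micro-benchmark: time a fixed amount of local work. */
    double t0 = MPI_Wtime();
    volatile double s = 0.0;
    for (long i = 0; i < 20000000L; i++) s += 1.0 / (double)(i + 1);
    double speed = 1.0 / (MPI_Wtime() - t0);   /* higher = faster node */

    double* speeds = malloc(size * sizeof(double));
    MPI_Allgather(&speed, 1, MPI_DOUBLE, speeds, 1, MPI_DOUBLE,
                  MPI_COMM_WORLD);

    double total = 0.0;
    for (int i = 0; i < size; i++) total += speeds[i];

    /* Each rank takes a share of N items proportional to its speed. */
    const long N = 1000000L;
    long begin = 0, count = 0;
    for (int i = 0; i <= rank; i++) {
        count = (long)(N * speeds[i] / total);
        if (i < rank) begin += count;
    }
    if (rank == size - 1) count = N - begin;   /* last rank absorbs rounding */

    printf("rank %d: items [%ld, %ld)\n", rank, begin, begin + count);

    free(speeds);
    MPI_Finalize();
    return 0;
}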

2011/10/5 Andreas Schäfer :
> I'm afraid you'll have to do this kind of load balancing in your
> application itself as Open MPI (just like any other MPI implementation)
> has no notion of how your application manages its workload.
>
> HTH
> -Andreas
>
>
> On 14:05 Wed 05 Oct     , Mickaël CANÉVET wrote:
>> Hi,
>>
>> Is there a way to define a weight to the CPUs of the hosts. I have a
>> cluster made of machine from different generation and when I run a
>> process on it, the whole cluster is slowed down by the slowest node.
>>
>> What I'd like to do is something like that in my hostfile:
>>
>> oldest slots=4 weight=0.75
>> newer slots=8 weight=0.95
>> newest slots=12 weight=1
>>
>> So that CPUs of oldest (and slowest) machine gets less data to process.
>>
>> Thank you
>> Mickaël
>
>
>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> ==
> Andreas Schäfer
> HPC and Grid Computing
> Chair of Computer Science 3
> Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
> +49 9131 85-27910
> PGP/GPG key via keyserver
> http://www.libgeodecomp.org
> ==
>
> (\___/)
> (+'.'+)
> (")_(")
> This is Bunny. Copy and paste Bunny into your
> signature to help him gain world domination!
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] [SOLVED] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-10-03 Thread Dmitry N. Mikushin
Ok, here's the solution: remove the --as-needed option from the compiler's
internal linker invocation command line. Steps to do this:

1) Dump the compiler specs: $ gcc -dumpspecs > specs
2) Open the specs file for editing and remove --as-needed from the line

*link:
%{!r:--build-id} --no-add-needed --as-needed %{!static:--eh-frame-hdr}
%{!m32:-m elf_x86_64} %{m32:-m elf_i386} --hash-style=gnu
%{shared:-shared}   %{!shared: %{!static:
%{rdynamic:-export-dynamic}   %{m32:-dynamic-linker
%{muclibc:/lib/ld-uClibc.so.0;:%{mbionic:/system/bin/linker;:/lib/ld-linux.so.2}}}
  %{!m32:-dynamic-linker
%{muclibc:/lib/ld64-uClibc.so.0;:%{mbionic:/system/bin/linker64;:/lib64/ld-linux-x86-64.so.2
%{static:-static}}

resulting in

*link:
%{!r:--build-id} --no-add-needed %{!static:--eh-frame-hdr} %{!m32:-m
elf_x86_64} %{m32:-m elf_i386} --hash-style=gnu   %{shared:-shared}
%{!shared: %{!static:   %{rdynamic:-export-dynamic}
%{m32:-dynamic-linker
%{muclibc:/lib/ld-uClibc.so.0;:%{mbionic:/system/bin/linker;:/lib/ld-linux.so.2}}}
  %{!m32:-dynamic-linker
%{muclibc:/lib/ld64-uClibc.so.0;:%{mbionic:/system/bin/linker64;:/lib64/ld-linux-x86-64.so.2
%{static:-static}}

3) Save the specs file into the compiler's folder
/usr/lib/gcc/<target>/<version>/. For example, in case of Ubuntu 11.10
with gcc 4.6.1 it's /usr/lib/gcc/x86_64-linux-gnu/4.6.1/

With this change there are no unresolvable relocations anymore!

- D.

2011/10/3 Dmitry N. Mikushin :
> Hi,
>
> Here's a reprocase, the same one as mentioned here:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=608901
>
> marcusmae@loveland:~/Programming/mpitest$ cat mpitest.f90
> program main
> include 'mpif.h'
> integer ierr
> call mpi_init(ierr)
> end
>
> marcusmae@loveland:~/Programming/mpitest$ mpif90 -g mpitest.f90
> /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x542): unresolvable
> R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_'
> /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x55c): unresolvable
> R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_'
> /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5d2): unresolvable
> R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_'
> /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5ec): unresolvable
> R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_'
>
> Remove "-g", and the error will be gone.
>
> marcusmae@loveland:~/Programming/mpitest$ mpif90 --showme -g mpitest.f90
> gfortran -g mpitest.f90 -I/opt/openmpi_gcc-1.5.4/include -pthread
> -I/opt/openmpi_gcc-1.5.4/lib -L/opt/openmpi_gcc-1.5.4/lib -lmpi_f90
> -lmpi_f77 -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
>
> marcusmae@loveland:~/Programming/mpitest$ mpif90 -v
> Using built-in specs.
> COLLECT_GCC=/usr/bin/gfortran
> COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
> Target: x86_64-linux-gnu
> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
> 4.6.1-9ubuntu3'
> --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
> --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr
> --program-suffix=-4.6 --enable-shared --enable-linker-build-id
> --with-system-zlib --libexecdir=/usr/lib --without-included-gettext
> --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
> --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
> --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin
> --enable-objc-gc --disable-werror --with-arch-32=i686
> --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu
> --host=x86_64-linux-gnu --target=x86_64-linux-gnu
> Thread model: posix
> gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)
>
> 2011/9/28 Dmitry N. Mikushin :
>> Hi,
>>
>> Interestingly, the errors are gone after I removed "-g" from the app
>> compile options.
>>
>> I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4
>> compile fine, but with the same error.
>> Also I tried hard to find any 32-bit object or library and failed.
>> They all are 64-bit.
>>
>> - D.
>>
>> 2011/9/24 Jeff Squyres :
>>> Check the output from when you ran Open MPI's configure and "make all" -- 
>>> did it decide to build the F77 interface?
>>>
>>> Also check that gcc and gfortran output .o files of the same bitness / type.
>>>
>>>
>>> On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote:
>>>
>>>> Compile and link - yes, but it turns out there was some unnoticed
>>>> compilation error because
>>>>
>>>> ./hellompi: error while loading shared libraries: libmpi_f77.so.1:
>>>> cannot open shared object file: No such 

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-10-03 Thread Dmitry N. Mikushin
Hi,

Here's a repro case, the same one as mentioned here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=608901

marcusmae@loveland:~/Programming/mpitest$ cat mpitest.f90
program main
include 'mpif.h'
integer ierr
call mpi_init(ierr)
end

marcusmae@loveland:~/Programming/mpitest$ mpif90 -g mpitest.f90
/usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x542): unresolvable
R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_'
/usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x55c): unresolvable
R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_'
/usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5d2): unresolvable
R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_'
/usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5ec): unresolvable
R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_'

Remove "-g", and the error will be gone.

marcusmae@loveland:~/Programming/mpitest$ mpif90 --showme -g mpitest.f90
gfortran -g mpitest.f90 -I/opt/openmpi_gcc-1.5.4/include -pthread
-I/opt/openmpi_gcc-1.5.4/lib -L/opt/openmpi_gcc-1.5.4/lib -lmpi_f90
-lmpi_f77 -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

marcusmae@loveland:~/Programming/mpitest$ mpif90 -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.6.1-9ubuntu3'
--with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr
--program-suffix=-4.6 --enable-shared --enable-linker-build-id
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin
--enable-objc-gc --disable-werror --with-arch-32=i686
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)

2011/9/28 Dmitry N. Mikushin :
> Hi,
>
> Interestingly, the errors are gone after I removed "-g" from the app
> compile options.
>
> I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4
> compile fine, but with the same error.
> Also I tried hard to find any 32-bit object or library and failed.
> They all are 64-bit.
>
> - D.
>
> 2011/9/24 Jeff Squyres :
>> Check the output from when you ran Open MPI's configure and "make all" -- 
>> did it decide to build the F77 interface?
>>
>> Also check that gcc and gfortran output .o files of the same bitness / type.
>>
>>
>> On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote:
>>
>>> Compile and link - yes, but it turns out there was some unnoticed
>>> compilation error because
>>>
>>> ./hellompi: error while loading shared libraries: libmpi_f77.so.1:
>>> cannot open shared object file: No such file or directory
>>>
>>> and this library does not exist.
>>>
>>> Hm.
>>>
>>> 2011/9/24 Jeff Squyres :
>>>> Can you compile / link simple OMPI applications without this problem?
>>>>
>>>> On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote:
>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> Today I've verified this application on the Feroda 15 x86_64, where
>>>>> I'm usually building OpenMPI from source using the same method.
>>>>> Result: no link errors there! So, the issue is likely ubuntu-specific.
>>>>>
>>>>> Target application is compiled linked with mpif90 pointing to
>>>>> /opt/openmpi_gcc-1.5.4/bin/mpif90 I built.
>>>>>
>>>>> Regarding architectures, everything in target folders and OpenMPI
>>>>> installation is
>>>>> ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
>>>>> linked, not stripped
>>>>>
>>>>> - D.
>>>>>
>>>>> 2011/9/24 Jeff Squyres :
>>>>>> How does the target application compile / link itself?
>>>>>>
>>>>>> Try running "file" on the Open MPI libraries and/or your target 
>>>>>> application .o files to see what their bitness is, etc.
>>>>>>
>>>>>>
>>>>>> On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote:
>>>>>>
>>>>>>> Hi Jeff,
>>>>>>>
>>>>>>> You're right because I 

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-28 Thread Dmitry N. Mikushin
Hi,

Interestingly, the errors are gone after I removed "-g" from the app
compile options.

I tested again on a fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4
compile fine, but the same error remains.
Also, I tried hard to find any 32-bit object or library and failed;
they are all 64-bit.

- D.

2011/9/24 Jeff Squyres :
> Check the output from when you ran Open MPI's configure and "make all" -- did 
> it decide to build the F77 interface?
>
> Also check that gcc and gfortran output .o files of the same bitness / type.
>
>
> On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote:
>
>> Compile and link - yes, but it turns out there was some unnoticed
>> compilation error because
>>
>> ./hellompi: error while loading shared libraries: libmpi_f77.so.1:
>> cannot open shared object file: No such file or directory
>>
>> and this library does not exist.
>>
>> Hm.
>>
>> 2011/9/24 Jeff Squyres :
>>> Can you compile / link simple OMPI applications without this problem?
>>>
>>> On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote:
>>>
>>>> Hi Jeff,
>>>>
>>>> Today I've verified this application on the Feroda 15 x86_64, where
>>>> I'm usually building OpenMPI from source using the same method.
>>>> Result: no link errors there! So, the issue is likely ubuntu-specific.
>>>>
>>>> Target application is compiled linked with mpif90 pointing to
>>>> /opt/openmpi_gcc-1.5.4/bin/mpif90 I built.
>>>>
>>>> Regarding architectures, everything in target folders and OpenMPI
>>>> installation is
>>>> ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
>>>> linked, not stripped
>>>>
>>>> - D.
>>>>
>>>> 2011/9/24 Jeff Squyres :
>>>>> How does the target application compile / link itself?
>>>>>
>>>>> Try running "file" on the Open MPI libraries and/or your target 
>>>>> application .o files to see what their bitness is, etc.
>>>>>
>>>>>
>>>>> On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote:
>>>>>
>>>>>> Hi Jeff,
>>>>>>
>>>>>> You're right because I also tried 1.4.3, and it's the same issue
>>>>>> there. But what could be wrong? I'm using the simplest form -
>>>>>> ../configure --prefix=/opt/openmpi_gcc-1.4.3/ and only installed
>>>>>> compilers are system-default gcc and gfortran 4.6.1. Distro is ubuntu
>>>>>> 11.10. There is no any mpi installed from packages, and no -m32
>>>>>> options around. What else could be the source?
>>>>>>
>>>>>> Thanks,
>>>>>> - D.
>>>>>>
>>>>>> 2011/9/22 Jeff Squyres :
>>>>>>> This usually means that you're mixing compiler/linker flags somehow 
>>>>>>> (e.g., built something with 32 bit, built something else with 64 bit, 
>>>>>>> try to link them together).
>>>>>>>
>>>>>>> Can you verify that everything was built with all the same 32/64?
>>>>>>>
>>>>>>>
>>>>>>> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives
>>>>>>>> a load of linker messages like this one:
>>>>>>>>
>>>>>>>> /usr/bin/ld: 
>>>>>>>> ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d):
>>>>>>>> unresolvable R_X86_64_64 relocation against symbol
>>>>>>>> `mpi_fortran_argv_null_
>>>>>>>>
>>>>>>>> There are a lot of similar messages about other mpi_fortran_ symbols.
>>>>>>>> Is it a known issue?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> - D.
>>>>>>>> ___
>>>>>>>> users mailing list
>>>>>>>> us...@open-mpi.org
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Sq

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Dmitry N. Mikushin
Compile and link - yes, but it turns out there was some unnoticed
compilation error because

./hellompi: error while loading shared libraries: libmpi_f77.so.1:
cannot open shared object file: No such file or directory

and this library does not exist.

Hm.

2011/9/24 Jeff Squyres :
> Can you compile / link simple OMPI applications without this problem?
>
> On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote:
>
>> Hi Jeff,
>>
>> Today I've verified this application on the Feroda 15 x86_64, where
>> I'm usually building OpenMPI from source using the same method.
>> Result: no link errors there! So, the issue is likely ubuntu-specific.
>>
>> Target application is compiled linked with mpif90 pointing to
>> /opt/openmpi_gcc-1.5.4/bin/mpif90 I built.
>>
>> Regarding architectures, everything in target folders and OpenMPI
>> installation is
>> ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
>> linked, not stripped
>>
>> - D.
>>
>> 2011/9/24 Jeff Squyres :
>>> How does the target application compile / link itself?
>>>
>>> Try running "file" on the Open MPI libraries and/or your target application 
>>> .o files to see what their bitness is, etc.
>>>
>>>
>>> On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote:
>>>
>>>> Hi Jeff,
>>>>
>>>> You're right because I also tried 1.4.3, and it's the same issue
>>>> there. But what could be wrong? I'm using the simplest form -
>>>> ../configure --prefix=/opt/openmpi_gcc-1.4.3/ and only installed
>>>> compilers are system-default gcc and gfortran 4.6.1. Distro is ubuntu
>>>> 11.10. There is no any mpi installed from packages, and no -m32
>>>> options around. What else could be the source?
>>>>
>>>> Thanks,
>>>> - D.
>>>>
>>>> 2011/9/22 Jeff Squyres :
>>>>> This usually means that you're mixing compiler/linker flags somehow 
>>>>> (e.g., built something with 32 bit, built something else with 64 bit, try 
>>>>> to link them together).
>>>>>
>>>>> Can you verify that everything was built with all the same 32/64?
>>>>>
>>>>>
>>>>> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives
>>>>>> a load of linker messages like this one:
>>>>>>
>>>>>> /usr/bin/ld: 
>>>>>> ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d):
>>>>>> unresolvable R_X86_64_64 relocation against symbol
>>>>>> `mpi_fortran_argv_null_
>>>>>>
>>>>>> There are a lot of similar messages about other mpi_fortran_ symbols.
>>>>>> Is it a known issue?
>>>>>>
>>>>>> Thanks,
>>>>>> - D.
>>>>>> ___
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquy...@cisco.com
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>
>>>>>
>>>>> ___
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Dmitry N. Mikushin
Hi Jeff,

Today I've verified this application on Fedora 15 x86_64, where
I'm usually building OpenMPI from source using the same method.
Result: no link errors there! So, the issue is likely Ubuntu-specific.

The target application is compiled and linked with mpif90 pointing to
the /opt/openmpi_gcc-1.5.4/bin/mpif90 I built.

Regarding architectures, everything in the target folders and the
OpenMPI installation is
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically
linked, not stripped

- D.

2011/9/24 Jeff Squyres :
> How does the target application compile / link itself?
>
> Try running "file" on the Open MPI libraries and/or your target application 
> .o files to see what their bitness is, etc.
>
>
> On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote:
>
>> Hi Jeff,
>>
>> You're right because I also tried 1.4.3, and it's the same issue
>> there. But what could be wrong? I'm using the simplest form -
>> ../configure --prefix=/opt/openmpi_gcc-1.4.3/ and only installed
>> compilers are system-default gcc and gfortran 4.6.1. Distro is ubuntu
>> 11.10. There is no any mpi installed from packages, and no -m32
>> options around. What else could be the source?
>>
>> Thanks,
>> - D.
>>
>> 2011/9/22 Jeff Squyres :
>>> This usually means that you're mixing compiler/linker flags somehow (e.g., 
>>> built something with 32 bit, built something else with 64 bit, try to link 
>>> them together).
>>>
>>> Can you verify that everything was built with all the same 32/64?
>>>
>>>
>>> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote:
>>>
>>>> Hi,
>>>>
>>>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives
>>>> a load of linker messages like this one:
>>>>
>>>> /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d):
>>>> unresolvable R_X86_64_64 relocation against symbol
>>>> `mpi_fortran_argv_null_
>>>>
>>>> There are a lot of similar messages about other mpi_fortran_ symbols.
>>>> Is it a known issue?
>>>>
>>>> Thanks,
>>>> - D.
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-22 Thread Dmitry N. Mikushin
Hi Jeff,

You're right, because I also tried 1.4.3, and it's the same issue
there. But what could be wrong? I'm using the simplest form -
../configure --prefix=/opt/openmpi_gcc-1.4.3/ - and the only installed
compilers are the system-default gcc and gfortran 4.6.1. The distro is
Ubuntu 11.10. There is no MPI installed from packages, and no -m32
options anywhere. What else could be the cause?

Thanks,
- D.

2011/9/22 Jeff Squyres :
> This usually means that you're mixing compiler/linker flags somehow (e.g., 
> built something with 32 bit, built something else with 64 bit, try to link 
> them together).
>
> Can you verify that everything was built with all the same 32/64?
>
>
> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote:
>
>> Hi,
>>
>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives
>> a load of linker messages like this one:
>>
>> /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d):
>> unresolvable R_X86_64_64 relocation against symbol
>> `mpi_fortran_argv_null_
>>
>> There are a lot of similar messages about other mpi_fortran_ symbols.
>> Is it a known issue?
>>
>> Thanks,
>> - D.
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-22 Thread Dmitry N. Mikushin
Same error when configured with --with-pic --with-gnu-ld

2011/9/22 Dmitry N. Mikushin :
> Hi,
>
> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives
> a load of linker messages like this one:
>
> /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d):
> unresolvable R_X86_64_64 relocation against symbol
> `mpi_fortran_argv_null_
>
> There are a lot of similar messages about other mpi_fortran_ symbols.
> Is it a known issue?
>
> Thanks,
> - D.
>


[OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-22 Thread Dmitry N. Mikushin
Hi,

OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives
a load of linker messages like this one:

/usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d):
unresolvable R_X86_64_64 relocation against symbol
`mpi_fortran_argv_null_

There are a lot of similar messages about other mpi_fortran_ symbols.
Is it a known issue?

Thanks,
- D.


Re: [OMPI users] Compiling both 32-bit and 64-bit?

2011-08-24 Thread Dmitry N. Mikushin
Thanks, Brian,

I'm trying to follow the guide for 1.5.4; it's not yet clear what's wrong:

[marcusmae@zacate build32]$ ../configure
--prefix=/opt/openmpi_kgen-1.5.4
--includedir=/opt/openmpi_kgen-1.5.4/include/32
--libdir=/opt/openmpi_kgen-1.5.4/lib32 --build=x86_64-unknown-linux
--host=x86_64-unknown-linux --target=i686-unknown-linux
--disable-binaries

...

configure: WARNING: *** The Open MPI configure script does not support
--program-prefix, --program-suffix or --program-transform-name. Users
are recommended to instead use --prefix with a unique directory and
make symbolic links as desired for renaming.
configure: error: *** Cannot continue

[marcusmae@zacate build32]$ ../configure
--prefix=/opt/openmpi_kgen-1.5.4
--includedir=/opt/openmpi_kgen-1.5.4/include/32
--libdir=/opt/openmpi_kgen-1.5.4/lib32 --build=x86_64-unknown-linux
--host=i686-unknown-linux --disable-binaries

...

checking gfortran external symbol convention... link: invalid option -- 'd'
Try `link --help' for more information.
link: invalid option -- 'd'
Try `link --help' for more information.
link: invalid option -- 'd'
Try `link --help' for more information.
link: invalid option -- 'd'
Try `link --help' for more information.
link: invalid option -- 'd'
Try `link --help' for more information.

configure: error: unknown naming convention:

2011/8/24 Barrett, Brian W :
> On 8/24/11 11:29 AM, "Dmitry N. Mikushin"  wrote:
>
>>Quick question: is there an easy switch to compile and install both
>>32-bit and 64-bit OpenMPI libraries into a single tree? E.g. 64-bit in
>>/prefix/lib64 and 32-bit in /prefix/lib.
>
> Quick answer: not easily.
>
> Long answer: There's not an easy way, but there are some facilities to
> help.  I believe Oracle uses them when building binaries for Solaris.
> There is some documentation available on our Trac wiki:
>
>  https://svn.open-mpi.org/trac/ompi/wiki/MultiLib
>  https://svn.open-mpi.org/trac/ompi/wiki/compilerwrapper3264
>
> The difficulty is that it's up to the user/admin to make sure the correct
> arguments are provided, as well as writing the wrapper script files to do
> the sharing.
>
> Brian
>
> --
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
>
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



[OMPI users] Compiling both 32-bit and 64-bit?

2011-08-24 Thread Dmitry N. Mikushin
Hi,

Quick question: is there an easy switch to compile and install both
32-bit and 64-bit OpenMPI libraries into a single tree? E.g. 64-bit in
/prefix/lib64 and 32-bit in /prefix/lib.

Thanks,
- D.


Re: [OMPI users] OpenMPI causing WRF to crash

2011-08-03 Thread Dmitry N. Mikushin
BasitAli,

Signal 15 (SIGTERM) means one of WRF's MPI processes has been
terminated unexpectedly, possibly by a deliberate program decision.
Whether or not it is OpenMPI-specific, the issue needs to be tracked
down somehow to get more details. Ideally, the best thing is to have a
debugger attached once the process receives the signal; then you can
see the call trace and figure out what exactly happened. This can be
done by registering a custom signal handler (see the Unix documentation
for signals) or by running the MPI processes inside an external
diagnostic tool, for example valgrind:

mpirun -np  valgrind --db-attach=yes ./appname

... or by consulting the WRF community to check whether they have
already established some other approach.

Good luck with resolving this case!
- D.
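
P.S. For reference, a minimal and untested sketch of the signal-handler
approach in C, assuming glibc's backtrace() is available; WRF would need
something like this wired into its own startup code, so treat it only as an
illustration:

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void on_term(int sig)
{
    void* frames[64];
    int n = backtrace(frames, 64);
    /* Dump the raw call trace to stderr before the process dies. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(128 + sig);
}

int main(int argc, char* argv[])
{
    (void)argc; (void)argv;
    signal(SIGTERM, on_term);   /* catches "killed with signal 15" */
    /* ... MPI_Init and the rest of the application ... */
    return 0;
}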

2011/8/3 BasitAli  Khan :
> I am trying to run a rather heavy wrf simulation with spectral nudging but
> the simulation crashes after 1.8 minutes of integration.
>  The simulation has two domains    with  d01 = 601x601 and d02 = 721x721 and
> 51 vertical levels. I tried this simulation on two different systems but
> result was more or less same. For example
> On our Bluegene/P  with SUSE Linux Enterprise Server 10 ppc and XLF
> compiler I tried to run wrf on 2048 shared memory nodes (1 compute node = 4
> cores , 32 bit, 850 Mhz). For the parallel run I used mpixlc, mpixlcxx and
> mpixlf90.  I got the following error message in the wrf.err file
>  BE_MPI (ERROR): The error message in the job
> record is as follows:
>  BE_MPI (ERROR):   "killed with signal 15"
> I also tried to run the same simulation on our linux cluster (Linux Red Hat
> Enterprise 5.4m  x86_64 and Intel compiler) with 8, 16 and 64 nodes (1
> compute node=8 cores). For the parallel run I am
> used mpi/openmpi/1.4.2-intel-11. I got the following error message in the
> error log after couple of minutes of integration.
> "mpirun has exited due to process rank 45 with PID 19540 on
> node ci118 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here)."
> I tried many things but nothing seems to be working. However, if I reduce
>  grid points below 200, the simulation goes fine. It appears that probably
> OpenMP has problem with large number of grid points but I have no idea how
> to fix it. I will greatly appreciate if you could suggest some solution.
> Best regards,
> ---
> Basit A. Khan, Ph.D.
> Postdoctoral Fellow
> Division of Physical Sciences & Engineering
> Office# 3204, Level 3, Building 1,
> King Abdullah University of Science & Technology
> 4700 King Abdullah Blvd, Box 2753, Thuwal 23955 –6900,
> Kingdom of Saudi Arabia.
> Office: +966(0)2 808 0276,  Mobile: +966(0)5 9538 7592
> E-mail: basitali.k...@kaust.edu.sa
> Skype name: basit.a.khan
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] Error installing OpenMPI 1.5.3

2011-07-10 Thread Dmitry N. Mikushin
Sorry, disregard this, the issue was created by my own buggy compiler wrapper.

- D.

2011/7/10 Dmitry N. Mikushin :
> Hi,
>
> Maybe it would be useful to report the openmpi 1.5.3 archive currently
> has a strange issue when installing on Fedora 15 x86_64 (gcc 4.6),
> that *does not* happen with 1.4.3:
>
> $ ../configure --prefix=/opt/openmpi_kgen-1.5.3 CC=gcc CXX=g++
> F77=gfortran FC=gfortran
>
> ...
>
> $ sudo make install
>
> ...
>
> make[5]: Entering directory
> `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
> test -z "/opt/openmpi_kgen-1.5.3/lib" || /bin/mkdir -p
> "/opt/openmpi_kgen-1.5.3/lib"
>  /bin/sh ../../../libtool   --mode=install /usr/bin/install -c
> libmpi_f90.la '/opt/openmpi_kgen-1.5.3/lib'
> libtool: install: warning: relinking `libmpi_f90.la'
> libtool: install: (cd
> /home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90; /bin/sh
> /home/marcusmae/Programming/openmpi-1.5.3/build/libtool  --silent
> --tag FC --mode=relink /usr/bin/gfortran -I../../../ompi/include
> -I../../../../ompi/include -I. -I../../../../ompi/mpi/f90
> -I../../../ompi/mpi/f90 -version-info 1:1:0 -export-dynamic -o
> libmpi_f90.la -rpath /opt/openmpi_kgen-1.5.3/lib mpi.lo mpi_sizeof.lo
> mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.lo
> mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo
> mpi_wtime_f90.lo ../../../ompi/mpi/f77/libmpi_f77.la -lnsl -lutil -lm
> )
> mv: cannot stat `libmpi_f90.so.1.0.1': No such file or directory
> libtool: install: error: relink `libmpi_f90.la' with the above command
> before installing it
> make[5]: *** [install-libLTLIBRARIES] Error 1
> make[5]: Leaving directory
> `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
> make[4]: *** [install-am] Error 2
> make[4]: Leaving directory
> `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
> make[3]: *** [install-recursive] Error 1
> make[3]: Leaving directory
> `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
> make[2]: *** [install] Error 2
> make[2]: Leaving directory
> `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
> make[1]: *** [install-recursive] Error 1
> make[1]: Leaving directory
> `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi'
> make: *** [install-recursive] Error 1
>
> Is it a known problem?
>
> Thanks,
> - D.
>



[OMPI users] Error installing OpenMPI 1.5.3

2011-07-10 Thread Dmitry N. Mikushin
Hi,

Maybe it would be useful to report that the openmpi 1.5.3 archive
currently has a strange issue when installing on Fedora 15 x86_64
(gcc 4.6) that *does not* happen with 1.4.3:

$ ../configure --prefix=/opt/openmpi_kgen-1.5.3 CC=gcc CXX=g++
F77=gfortran FC=gfortran

...

$ sudo make install

...

make[5]: Entering directory
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
test -z "/opt/openmpi_kgen-1.5.3/lib" || /bin/mkdir -p
"/opt/openmpi_kgen-1.5.3/lib"
 /bin/sh ../../../libtool   --mode=install /usr/bin/install -c
libmpi_f90.la '/opt/openmpi_kgen-1.5.3/lib'
libtool: install: warning: relinking `libmpi_f90.la'
libtool: install: (cd
/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90; /bin/sh
/home/marcusmae/Programming/openmpi-1.5.3/build/libtool  --silent
--tag FC --mode=relink /usr/bin/gfortran -I../../../ompi/include
-I../../../../ompi/include -I. -I../../../../ompi/mpi/f90
-I../../../ompi/mpi/f90 -version-info 1:1:0 -export-dynamic -o
libmpi_f90.la -rpath /opt/openmpi_kgen-1.5.3/lib mpi.lo mpi_sizeof.lo
mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.lo
mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo
mpi_wtime_f90.lo ../../../ompi/mpi/f77/libmpi_f77.la -lnsl -lutil -lm
)
mv: cannot stat `libmpi_f90.so.1.0.1': No such file or directory
libtool: install: error: relink `libmpi_f90.la' with the above command
before installing it
make[5]: *** [install-libLTLIBRARIES] Error 1
make[5]: Leaving directory
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
make[4]: *** [install-am] Error 2
make[4]: Leaving directory
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
make[3]: *** [install-recursive] Error 1
make[3]: Leaving directory
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
make[2]: *** [install] Error 2
make[2]: Leaving directory
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi'
make: *** [install-recursive] Error 1

Is it a known problem?

Thanks,
- D.


Re: [OMPI users] mpif90 compiler non-functional

2011-06-22 Thread Dmitry N. Mikushin
Alexandre,

> How can I make to point to the new installed version in
> /opt/openmpi-1.4.3, when calling mpif90 or mpif77 as a common user ?

If you need to switch between multiple working MPI implementations
frequently (a common need on public clusters or during local
testing/benchmarking), scripts like mpi-selector can be very handy.
First you register all possible variants with mpi-selector --register
 , and then you can switch the current one with
mpi-selector --set name (and restart the shell). Technically, it does
the same thing already mentioned - adding entries to $PATH and
LD_LIBRARY_PATH. The script is part of Red Hat distros (and was written
by Jeff, I suppose), but you can easily rebuild its source rpm for your
system or convert it with alien if you are on Ubuntu (works for me).

- D.

2011/6/22 Alexandre Souza :
> Thanks Dimitri and Jeff for the output,
> I managed build the mpi and run the examples in f77 and f90 doing the 
> guideline.
> However the only problem is I was logged as Root.
> When I compile the examples with mpif90 or mpif77 as common user, it
> keeps pointing to the old installation of mpi that does not use the
> fortran compiler.
> (/home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1)
> How can I make to point to the new installed version in
> /opt/openmpi-1.4.3, when calling mpif90 or mpif77 as a common user ?
> Alex
>
> On Wed, Jun 22, 2011 at 1:49 PM, Jeff Squyres  wrote:
>> Dimitry is correct -- if OMPI's configure can find a working C++ and Fortran 
>> compiler, it'll build C++ / Fortran support.  Yours was not, indicating that:
>>
>> a) you got a binary distribution from someone who didn't include C++ / 
>> Fortran support, or
>>
>> b) when you built/installed Open MPI, it couldn't find a working C++ / 
>> Fortran compiler, so it skipped building support for them.
>>
>>
>>
>> On Jun 22, 2011, at 12:05 PM, Dmitry N. Mikushin wrote:
>>
>>> Here's mine produced from default compilation:
>>>
>>>                 Package: Open MPI marcusmae@T61p Distribution
>>>                Open MPI: 1.4.4rc2
>>>   Open MPI SVN revision: r24683
>>>   Open MPI release date: May 05, 2011
>>>                Open RTE: 1.4.4rc2
>>>   Open RTE SVN revision: r24683
>>>   Open RTE release date: May 05, 2011
>>>                    OPAL: 1.4.4rc2
>>>       OPAL SVN revision: r24683
>>>       OPAL release date: May 05, 2011
>>>            Ident string: 1.4.4rc2
>>>                  Prefix: /opt/openmpi_gcc-1.4.4
>>> Configured architecture: x86_64-unknown-linux-gnu
>>>          Configure host: T61p
>>>           Configured by: marcusmae
>>>           Configured on: Tue May 24 18:39:21 MSD 2011
>>>          Configure host: T61p
>>>                Built by: marcusmae
>>>                Built on: Tue May 24 18:46:52 MSD 2011
>>>              Built host: T61p
>>>              C bindings: yes
>>>            C++ bindings: yes
>>>      Fortran77 bindings: yes (all)
>>>      Fortran90 bindings: yes
>>> Fortran90 bindings size: small
>>>              C compiler: gcc
>>>     C compiler absolute: /usr/bin/gcc
>>>            C++ compiler: g++
>>>   C++ compiler absolute: /usr/bin/g++
>>>      Fortran77 compiler: gfortran
>>>  Fortran77 compiler abs: /usr/bin/gfortran
>>>      Fortran90 compiler: gfortran
>>>  Fortran90 compiler abs: /usr/bin/gfortran
>>>
>>> gfortran version is:
>>>
>>> gcc version 4.6.0 20110530 (Red Hat 4.6.0-9) (GCC)
>>>
>>> How do you run ./configure? Maybe try "./configure
>>> FC=/usr/bin/gfortran" ? It should really really work out of box
>>> though. Configure scripts usually cook some simple test apps and run
>>> them to check if compiler works properly. So, your ./configure output
>>> may help to understand more.
>>>
>>> - D.
>>>
>>> 2011/6/22 Alexandre Souza :
>>>> Hi Dimitri,
>>>> Thanks for the reply.
>>>> I have openmpi installed before for another application in :
>>>> /home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1
>>>> I installed a new version in /opt/openmpi-1.4.3.
>>>> I reproduce some output from the screen :
>>>> amscosta@amscosta-desktop:/opt/openmpi-1.4.3/bin$ ompi_info
>>>>                 Package: Open MPI amscosta@amscosta-desktop Distribution
>>>>                Open MPI: 1.4.1
>>>>   Open MPI SVN revision: r22421
>>>>  

Re: [OMPI users] mpif90 compiler non-functional

2011-06-22 Thread Dmitry N. Mikushin
Here's mine produced from default compilation:

 Package: Open MPI marcusmae@T61p Distribution
Open MPI: 1.4.4rc2
   Open MPI SVN revision: r24683
   Open MPI release date: May 05, 2011
Open RTE: 1.4.4rc2
   Open RTE SVN revision: r24683
   Open RTE release date: May 05, 2011
OPAL: 1.4.4rc2
   OPAL SVN revision: r24683
   OPAL release date: May 05, 2011
Ident string: 1.4.4rc2
  Prefix: /opt/openmpi_gcc-1.4.4
 Configured architecture: x86_64-unknown-linux-gnu
  Configure host: T61p
   Configured by: marcusmae
   Configured on: Tue May 24 18:39:21 MSD 2011
  Configure host: T61p
Built by: marcusmae
Built on: Tue May 24 18:46:52 MSD 2011
  Built host: T61p
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: yes (all)
  Fortran90 bindings: yes
 Fortran90 bindings size: small
  C compiler: gcc
 C compiler absolute: /usr/bin/gcc
C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
  Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
  Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran

gfortran version is:

gcc version 4.6.0 20110530 (Red Hat 4.6.0-9) (GCC)

How do you run ./configure? Maybe try "./configure
FC=/usr/bin/gfortran"? It should really work out of the box,
though. Configure scripts usually build some simple test programs and
run them to check whether the compiler works properly, so your
./configure output may help us understand more.

- D.

2011/6/22 Alexandre Souza :
> Hi Dimitri,
> Thanks for the reply.
> I have openmpi installed before for another application in :
> /home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1
> I installed a new version in /opt/openmpi-1.4.3.
> I reproduce some output from the screen :
> amscosta@amscosta-desktop:/opt/openmpi-1.4.3/bin$ ompi_info
>                 Package: Open MPI amscosta@amscosta-desktop Distribution
>                Open MPI: 1.4.1
>   Open MPI SVN revision: r22421
>   Open MPI release date: Jan 14, 2010
>                Open RTE: 1.4.1
>   Open RTE SVN revision: r22421
>   Open RTE release date: Jan 14, 2010
>                    OPAL: 1.4.1
>       OPAL SVN revision: r22421
>       OPAL release date: Jan 14, 2010
>            Ident string: 1.4.1
>                  Prefix:
> /home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1
>  Configured architecture: i686-pc-linux-gnu
>          Configure host: amscosta-desktop
>           Configured by: amscosta
>           Configured on: Wed May 18 11:10:14 BRT 2011
>          Configure host: amscosta-desktop
>                Built by: amscosta
>                Built on: Wed May 18 11:16:21 BRT 2011
>              Built host: amscosta-desktop
>              C bindings: yes
>            C++ bindings: no
>      Fortran77 bindings: no
>      Fortran90 bindings: no
>  Fortran90 bindings size: na
>              C compiler: gcc
>     C compiler absolute: /usr/bin/gcc
>            C++ compiler: g++
>   C++ compiler absolute: /usr/bin/g++
>      Fortran77 compiler: gfortran
>  Fortran77 compiler abs: /usr/bin/gfortran
>      Fortran90 compiler: none
>  Fortran90 compiler abs: none
>             C profiling: no
>           C++ profiling: no
>     Fortran77 profiling: no
>     Fortran90 profiling: no
>          C++ exceptions: no
>          Thread support: posix (mpi: no, progress: no)
>           Sparse Groups: no
>  Internal debug support: no
>     MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>         libltdl support: yes
>   Heterogeneous support: no
>  mpirun default --prefix: no
>         MPI I/O support: yes
>       MPI_WTIME support: gettimeofday
> Symbol visibility support: yes
>  ..
>
>
> On Wed, Jun 22, 2011 at 12:34 PM, Dmitry N. Mikushin
>  wrote:
>> Alexandre,
>>
>> Did you have a working Fortran compiler in system in time of OpenMPI
>> compilation? To my experience Fortran bindings are always compiled by
>> default. How did you configured it and have you noticed any messages
>> reg. Fortran support in configure output?
>>
>> - D.
>>
>> 2011/6/22 Alexandre Souza :
>>> Dear Group,
>>> After compiling the openmpi source, the following message is displayed
>>> when trying to compile
>>> the hello program in fortran :
>>> amscosta@amscosta-desktop:~/openmpi-1.4.3/examples$
>>> /opt/openmpi-1.4.3/bin/mpif90 -g hello_f90.f90 -o hello_f90
>>> --

Re: [OMPI users] mpif90 compiler non-functional

2011-06-22 Thread Dmitry N. Mikushin
Alexandre,

Did you have a working Fortran compiler on the system at the time of the
OpenMPI compilation? In my experience, Fortran bindings are always compiled
by default. How did you configure it, and did you notice any messages
regarding Fortran support in the configure output?

- D.

2011/6/22 Alexandre Souza :
> Dear Group,
> After compiling the openmpi source, the following message is displayed
> when trying to compile
> the hello program in fortran :
> amscosta@amscosta-desktop:~/openmpi-1.4.3/examples$
> /opt/openmpi-1.4.3/bin/mpif90 -g hello_f90.f90 -o hello_f90
> --
> Unfortunately, this installation of Open MPI was not compiled with
> Fortran 90 support.  As such, the mpif90 compiler is non-functional.
> --
> Any clue how to solve it is very welcome.
> Thanks,
> Alex
> P.S. I am using a ubuntu box with gfortran
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] USE mpi

2011-05-08 Thread Dmitry N. Mikushin
Oh, clear now, thank you!

2011/5/8 Steph Bredenhann 

>  Jeff is correct. The Intel environmental variables are either set in
> /etc/profile or /user/.bashrc (or manually). Root sets its own environmental
> variables and therefore the key is to make sure that the environmental
> variables are set before an installation as root is done, i.e.:
>
> source /opt/intel/Compiler/11.1/073/bin/ifortvars.sh intel64
> source /opt/intel/Compiler/11.1/073/bin/iccvars.sh intel64
>
> Then the rest of the procedure can follow.
>
> It sounds simple and it is, perhaps
>
>   --
> Steph Bredenhann
>
>   On Sun, 2011-05-08 at 09:09 -0400, Jeff Squyres (jsquyres) wrote:
>
> Make all gets the same environment as make install (assuming you do it in the 
> same shell). But if you sudo make install, the environment may be different - 
> it may not inherit everything from your environment.
>
> I advised the user to "sudo -s" and ten setup the compiler environment and 
> then run make install.
>
> Sent from my phone. No type good.
>
> On May 7, 2011, at 9:37 PM, "Dmitry N. Mikushin"  wrote:
>
> > Tim,
> >
> > I certainly do not expect anything special, just normally "make
> > install" should not have issues, if "make" passes fine, right? What we
> > have with OpenMPI is this strange difference: if ./configure CC=icc,
> > "make" works, and "make install" - does not; if ./configure
> > CC=/full/path/to/icc, then both "make" and "make install" work.
> > Nothing needs to be searched, icc is already in PATH, since
> > compilevars are sourced in profile.d. Or am I missing something?
> >
> > Thanks,
> > - D.
> >
> > 2011/5/8 Tim Prince :
> >> On 5/7/2011 2:35 PM, Dmitry N. Mikushin wrote:
> >>>>
> >>>> didn't find the icc compiler
> >>>
> >>> Jeff, on 1.4.3 I saw the same issue, even more generally: "make
> >>> install" cannot find the compiler, if it is an alien compiler (i.e.
> >>> not the default gcc) - same situation for intel or llvm, for example.
> >>> The workaround is to specify full paths to compilers with CC=...
> >>> FC=... in ./configure params. Could it be "make install" breaks some
> >>> env paths?
> >>>
> >>
> >> Most likely reason for not finding an installed icc is that the icc
> >> environment (source the compilervars script if you have a current version)
> >> wasn't set prior to running configure.  Setting up the compiler in question
> >> in accordance with its own instructions is a more likely solution than the
> >> absolute path choice.
> >> OpenMPI configure, for good reason, doesn't search your system to see where
> >> a compiler might be installed.  What if you had 2 versions of the same 
> >> named
> >> compiler?
> >> --
> >> Tim Prince
> >> ___
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing 
> listusers@open-mpi.orghttp://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] USE mpi

2011-05-07 Thread Dmitry N. Mikushin
Tim,

I certainly do not expect anything special; normally "make
install" should not have issues if "make" passes fine, right? What we
have with OpenMPI is this strange difference: with ./configure CC=icc,
"make" works but "make install" does not; with ./configure
CC=/full/path/to/icc, both "make" and "make install" work.
Nothing needs to be searched for - icc is already in PATH, since the
compiler environment scripts are sourced in profile.d. Or am I missing
something?

Thanks,
- D.

2011/5/8 Tim Prince :
> On 5/7/2011 2:35 PM, Dmitry N. Mikushin wrote:
>>>
>>> didn't find the icc compiler
>>
>> Jeff, on 1.4.3 I saw the same issue, even more generally: "make
>> install" cannot find the compiler, if it is an alien compiler (i.e.
>> not the default gcc) - same situation for intel or llvm, for example.
>> The workaround is to specify full paths to compilers with CC=...
>> FC=... in ./configure params. Could it be "make install" breaks some
>> env paths?
>>
>
> Most likely reason for not finding an installed icc is that the icc
> environment (source the compilervars script if you have a current version)
> wasn't set prior to running configure.  Setting up the compiler in question
> in accordance with its own instructions is a more likely solution than the
> absolute path choice.
> OpenMPI configure, for good reason, doesn't search your system to see where
> a compiler might be installed.  What if you had 2 versions of the same named
> compiler?
> --
> Tim Prince
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] USE mpi

2011-05-07 Thread Dmitry N. Mikushin
> didn't find the icc compiler

Jeff, on 1.4.3 I saw the same issue, even more generally: "make
install" cannot find the compiler when it is not the default gcc -
the same happens with Intel or LLVM, for example. The workaround is
to specify full paths to the compilers with CC=... FC=... in the
./configure params. Could it be that "make install" breaks some
env paths?

- D.

2011/5/8 Jeff Squyres :
> We iterated off-list -- the problem was that "sudo make install" didn't find 
> the icc compiler, and therefore didn't complete properly.
>
> It seems that the ompi_info and mpif90 cited in this thread were from some 
> other (broken?) OMPI installation.
>
>
>
> On May 7, 2011, at 3:01 PM, Steph Bredenhann wrote:
>
>> Sorry, I missed the 2nd statement:
>>
>>      Fortran90 bindings: yes
>> Fortran90 bindings size: small
>>       Fortran90 compiler: gfortran
>>   Fortran90 compiler abs: /usr/bin/gfortran
>>      Fortran90 profiling: yes
>>
>>
>> --
>> Steph Bredenhann
>>
>> On Sat, 2011-05-07 at 14:46 -0400, Jeff Squyres wrote:
>>> ompi_info | grep 90
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
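
A quick way to see why "sudo make install" may miss a compiler that the
login shell finds, and to confirm which Open MPI installation ompi_info
and mpif90 actually come from; the sudo behaviour noted below is a common
default, not something confirmed in this thread:

$ command -v icc                  # compiler as seen by the user's shell
$ sudo sh -c 'command -v icc || echo "icc is not in the PATH used by sudo"'
$ command -v ompi_info mpif90     # which installation the wrapper tools belong to
$ ompi_info | grep "Fortran90 compiler"

If the second command cannot find icc, passing absolute compiler paths to
./configure (as in the sketch above) or installing into a user-writable
prefix without sudo are the usual ways around it.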



Re: [OMPI users] Help: HPL Problem

2011-05-07 Thread Dmitry N. Mikushin
Eric,

You have a link-time error complaining about the absence of some
libraries. At least two of them, libm and libdl, must be provided by
the system, not by the MPI implementation. Could you check that they
are present in /usr/lib64? It would also be useful to figure out
whether the problem is global or specific to HPL: do you get any
errors compiling a simple "hello world" program with OpenMPI?

- D.

2011/5/7 Lee Eric :
> Hi,
>
> I encountered the following error messages when I compiled HPL.
>
> make[2]: Entering directory
> `/pool/measure/hpl-2.0/testing/ptest/Linux_PII_FBLAS'
> /pool/MPI/openmpi/bin/mpif90 -DAdd__ -DF77_INTEGER=int
> -DStringSunStyle  -I/pool/measure/hpl-2.0/include
> -I/pool/measure/hpl-2.0/include/Linux_PII_FBLAS
> -I/pool/MPI/openmpi/include -fomit-frame-pointer -O3 -funroll-loops -W
> -Wall -o /pool/measure/hpl-2.0/bin/Linux_PII_FBLAS/xhpl HPL_pddriver.o
>        HPL_pdinfo.o           HPL_pdtest.o
> /pool/measure/hpl-2.0/lib/Linux_PII_FBLAS/libhpl.a
> /pool/libs/BLAS/blas_LINUX.a /pool/MPI/openmpi/lib/libmpi.so
> /usr/bin/ld: cannot find -ldl
> /usr/bin/ld: cannot find -lnsl
> /usr/bin/ld: cannot find -lutil
> /usr/bin/ld: cannot find -lm
> /usr/bin/ld: cannot find -ldl
> /usr/bin/ld: cannot find -lm
> collect2: ld returned 1 exit status
> make[2]: *** [dexe.grd] Error 1
> make[2]: Leaving directory 
> `/pool/measure/hpl-2.0/testing/ptest/Linux_PII_FBLAS'
> make[1]: *** [build_tst] Error 2
> make[1]: Leaving directory `/pool/measure/hpl-2.0'
> make: *** [build] Error 2
>
> And the attachment is the makefile I created. The OS is Fedora 14 x86_64.
>
> Could anyone show me what is going wrong? Thanks.
>
> Eric
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
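
A hedged sketch of the checks suggested above; the package name is an
assumption for Fedora x86_64, and hello.c stands for any trivial MPI
program:

# Are the development symlinks for the system libraries present?
$ ls -l /usr/lib64/libm.so /usr/lib64/libdl.so /usr/lib64/libutil.so /usr/lib64/libnsl.so

# On Fedora these .so symlinks normally come with glibc-devel; if they
# are missing, installing it is the usual fix:
$ sudo yum install glibc-devel

# Then rule out the MPI toolchain itself:
$ /pool/MPI/openmpi/bin/mpicc hello.c -o hello
$ /pool/MPI/openmpi/bin/mpif90 --showme   # prints the full link line the wrapper uses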



Re: [OMPI users] OpenMPI-PGI: /usr/bin/ld: Warning: size of symbol `#' changed from # in #.o to # in #.so

2011-03-27 Thread Dmitry N. Mikushin
I checked that this issue is not caused by using different compile
options for different libraries. There is a set of libraries and an
executable compiled with mpif90, and the warning appears for the
executable's object file and one of the libraries...

2011/3/25 Dmitry N. Mikushin :
> Hi,
>
> I'm wondering if anybody has seen something similar, and whether you
> have succeeded in running your application compiled with
> openmpi-pgi-1.4.2 despite the following warnings:
>
> /usr/bin/ld: Warning: size of symbol `mpi_fortran_errcodes_ignore_'
> changed from 4 in foo.o to 8 in lib/libfoolib2.so
> /usr/bin/ld: Warning: size of symbol `mpi_fortran_argv_null_' changed
> from 1 in foo.o to 8 in lib/libfoolib2.so
> /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_message_mod_0_' in
> lib/libfoolib1.so is smaller than 32 in foo.o
> /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_abort_mod_0_' in
> lib/libfoolib1.so is smaller than 32 in foo.o
> /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_ioinit_mod_0_' in
> lib/libfoolib1.so is smaller than 32 in foo.o
> /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_gatherv_mod_6_' in
> lib/libfoolib1.so is smaller than 32 in foo.o
>
> The symbol names look like they are internal to OpenMPI; there was one
> similar issue in the archive back in 2006
> (https://svn.open-mpi.org/trac/ompi/changeset/11057). Could it have
> been hit again?
>
> Thanks,
> - D.
>
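
A minimal sketch for inspecting the symbols the linker warns about;
readelf is standard binutils, and the file names follow the warning text:

# Compare the size recorded for the same symbol in the object file and
# in the shared library it links against:
$ readelf -sW foo.o             | grep mpi_fortran_errcodes_ignore_
$ readelf -sW lib/libfoolib2.so | grep mpi_fortran_errcodes_ignore_

# A 4-vs-8 difference for the MPI Fortran sentinel symbols usually means
# the object and the library were built against different Open MPI
# installations or with different Fortran compilers/flags; that reading
# is an interpretation, not something confirmed in this thread.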


[OMPI users] OpenMPI-PGI: /usr/bin/ld: Warning: size of symbol `#' changed from # in #.o to # in #.so

2011-03-24 Thread Dmitry N. Mikushin
Hi,

I'm wondering if anybody has seen something similar, and whether you
have succeeded in running your application compiled with
openmpi-pgi-1.4.2 despite the following warnings:

/usr/bin/ld: Warning: size of symbol `mpi_fortran_errcodes_ignore_'
changed from 4 in foo.o to 8 in lib/libfoolib2.so
/usr/bin/ld: Warning: size of symbol `mpi_fortran_argv_null_' changed
from 1 in foo.o to 8 in lib/libfoolib2.so
/usr/bin/ld: Warning: alignment 16 of symbol `_mpl_message_mod_0_' in
lib/libfoolib1.so is smaller than 32 in foo.o
/usr/bin/ld: Warning: alignment 16 of symbol `_mpl_abort_mod_0_' in
lib/libfoolib1.so is smaller than 32 in foo.o
/usr/bin/ld: Warning: alignment 16 of symbol `_mpl_ioinit_mod_0_' in
lib/libfoolib1.so is smaller than 32 in foo.o
/usr/bin/ld: Warning: alignment 16 of symbol `_mpl_gatherv_mod_6_' in
lib/libfoolib1.so is smaller than 32 in foo.o

The symbol names look like they are internal to OpenMPI; there was one
similar issue in the archive back in 2006
(https://svn.open-mpi.org/trac/ompi/changeset/11057). Could it have
been hit again?

Thanks,
- D.