[OMPI users] (no subject)
Dear all,

ompi_info reports that pml components are available:

$ /usr/mpi/gcc/openmpi-3.1.0rc2/bin/ompi_info -a | grep pml
    MCA pml: v (MCA v2.1.0, API v2.0.0, Component v3.1.0)
    MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component v3.1.0)
    MCA pml: yalla (MCA v2.1.0, API v2.0.0, Component v3.1.0)
    MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v3.1.0)
    MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v3.1.0)
    MCA pml: ucx (MCA v2.1.0, API v2.0.0, Component v3.1.0)

However, when I try to use them, mpirun gives back:

--------------------------------------------------------------------------
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host:      cloudgpu6
  Framework: pml
--------------------------------------------------------------------------

With strace I can see that the libraries
/usr/mpi/gcc/openmpi-3.1.0rc2/lib64/openmpi/mca_pml_* are reached by
mpirun, and ldd does not show any unresolved dependencies for them.
How else could it be that the pml components are not found?

Thanks,
- Dmitry.

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] EBADF (Bad file descriptor) on a simplest "Hello world" program
ping

2018-06-01 22:29 GMT+03:00 Dmitry N. Mikushin:
> [original message quoted in full; snipped. See the original post, which follows.]
[OMPI users] EBADF (Bad file descriptor) on a simplest "Hello world" program
Dear all,

Looks like I have a weird issue I've never encountered before. While
trying to run the simplest "Hello world" program, I get:

$ cat hello.c
#include <mpi.h>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}
$ mpicc hello.c -o hello
$ mpirun -np 1 ./hello
--------------------------------------------------------------------------
WARNING: The accept(3) system call failed on a TCP socket.  While this
should generally never happen on a well-configured HPC system, the
most common causes when it does occur are:

  * The process ran out of file descriptors
  * The operating system ran out of file descriptors
  * The operating system ran out of memory

Your Open MPI job will likely hang until the failure reason is fixed
(e.g., more file descriptors and/or memory becomes available), and may
eventually timeout / abort.

  Local host:     M17xR4
  Errno:          9 (Bad file descriptor)
  Probable cause: Unknown cause; job will try to continue
--------------------------------------------------------------------------

Further tracing shows the following:

[pid 13498] accept(0, 0x7f2ec8000960, 0x7f2ee6740e7c) = -1 EBADF (Bad file descriptor)
[pid 13498] shutdown(0, SHUT_RDWR) = -1 EBADF (Bad file descriptor)
[pid 13498] close(0) = -1 EBADF (Bad file descriptor)
[pid 13498] open("/usr/share/openmpi/help-oob-tcp.txt", O_RDONLY) = 0
[pid 13498] ioctl(0, TCGETS, 0x7f2ee6740be0) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 13499] <... nanosleep resumed> NULL) = 0
[pid 13498] fstat(0,
[pid 13499] nanosleep({0, 10},
[pid 13498] <... fstat resumed> {st_mode=S_IFREG|0644, st_size=3025, ...}) = 0
[pid 13498] read(0, "# -*- text -*-\n#\n# Copyright (c)"..., 8192) = 3025
[pid 13498] read(0, "", 4096) = 0
[pid 13498] read(0, "", 8192) = 0
[pid 13498] ioctl(0, TCGETS, 0x7f2ee6740b40) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 13498] close(0) = 0
[pid 13499] <... nanosleep resumed> NULL) = 0
[pid 13499] nanosleep({0, 10},
[pid 13498] write(1, ""..., 768) = 768   [the payload is the same WARNING text as above]

In fact, "Bad file descriptor" first occurs a bit earlier, here:

[pid 13499] open("/proc/self/fd", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 20
[pid 13499] fstat(20, {st_mode=S_IFDIR|0500, st_size=0, ...}) = 0
[pid 13499] getdents(20, /* 25 entries */, 32768) = 600
[pid 13499] close(3) = 0
[pid 13499] close(4) = 0
[pid 13499] close(5) = 0
[pid 13499] close(6) = 0
[pid 13499] close(7) = 0
[pid 13499] close(8) = 0
[pid 13499] close(9) = 0
[pid 13499] close(10) = 0
[pid 13499] close(11) = 0
[pid 13499] close(12) = 0
[pid 13499] close(13) = 0
[pid 13499] close(14) = 0
[pid 13499] close(15) = 0
[pid 13499] close(16) = 0
[pid 13499] close(17) = 0
[pid 13499] close(18) = 0
[pid 13499] close(19) = 0
[pid 13499] close(20) = 0
[pid 13499] getdents(20, 0x1cc04a0, 32768) = -1 EBADF (Bad file descriptor)
[pid 13499] close(20) = -1 EBADF (Bad file descriptor)

Any idea how to fix this? System is Ubuntu 16.04:

Linux M17xR4 4.10.0-42-generic #46~16.04.1-Ubuntu SMP Mon Dec 4 15:57:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Kind regards,
- Dmitry.
Re: [OMPI users] Crash in libopen-pal.so
Hi Justin,

If you can build the application in debug mode, try inserting valgrind
into your MPI command; it is usually very good at tracking down the
origins of failing memory allocations.

Kind regards,
- Dmitry.

2017-06-20 1:10 GMT+03:00 Sylvain Jeaugey:
> Justin, can you try setting mpi_leave_pinned to 0 to disable
> libptmalloc2 and confirm this is related to ptmalloc?
>
> Thanks,
> Sylvain
>
> On 06/19/2017 03:05 PM, Justin Luitjens wrote:
>> I have an application that works on other systems, but on the current
>> system I'm running I'm seeing the following crash:
>>
>> [dt04:22457] *** Process received signal ***
>> [dt04:22457] Signal: Segmentation fault (11)
>> [dt04:22457] Signal code: Address not mapped (1)
>> [dt04:22457] Failing at address: 0x6a1da250
>> [dt04:22457] [ 0] /lib64/libpthread.so.0(+0xf370)[0x2b353370]
>> [dt04:22457] [ 1] /home/jluitjens/libs/openmpi/lib/libopen-pal.so.13(opal_memory_ptmalloc2_int_free+0x50)[0x2cbcf810]
>> [dt04:22457] [ 2] /home/jluitjens/libs/openmpi/lib/libopen-pal.so.13(opal_memory_ptmalloc2_free+0x9b)[0x2cbcff3b]
>> [dt04:22457] [ 3] ./hacc_tpm[0x42f068]
>> [dt04:22457] [ 4] ./hacc_tpm[0x42f231]
>> [dt04:22457] [ 5] ./hacc_tpm[0x40f64d]
>> [dt04:22457] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2c30db35]
>> [dt04:22457] [ 7] ./hacc_tpm[0x4115cf]
>> [dt04:22457] *** End of error message ***
>>
>> This app is a CUDA app but doesn't use GPUDirect, so that should be
>> irrelevant.
>>
>> I'm building with gcc/5.3.0, cuda/8.0.44, openmpi/1.10.7.
>>
>> I'm using this on CentOS 7 with a vanilla MPI configure line:
>> ./configure --prefix=/home/jluitjens/libs/openmpi/
>>
>> Currently I'm trying to do this with just a single MPI process, but
>> multiple MPI processes fail in the same way:
>>
>> mpirun --oversubscribe -np 1 ./command
>>
>> What is odd is that the crash occurs around the same spot in the code,
>> but not consistently at the same spot. The spot in the code where the
>> single thread is at the time of the crash is nowhere near MPI code;
>> the code where it is crashing is just using malloc to allocate some
>> memory. This makes me think the crash is due to a thread outside of
>> the application I'm working on (perhaps in OpenMPI itself), or perhaps
>> due to OpenMPI hijacking malloc/free.
>>
>> Does anyone have any ideas of what I could try to work around this
>> issue?
>>
>> Thanks,
>> Justin
>>
>> ------------------------------------------------------------------
>> This email message is for the sole use of the intended recipient(s)
>> and may contain confidential information. Any unauthorized review,
>> use, disclosure or distribution is prohibited. If you are not the
>> intended recipient, please contact the sender by reply email and
>> destroy all copies of the original message.
>> ------------------------------------------------------------------
Re: [OMPI users] MPI + system() call + Matlab MEX crashes
Hi Juraj,

Although the MPI infrastructure may technically support forking, it is
known that not all system resources can correctly replicate themselves
into a forked process. For example, forking inside an MPI program with
an active CUDA driver will result in a crash. Why not compile the
MATLAB code down into a native library and link it with the MPI
application directly? E.g. like here:
https://www.mathworks.com/matlabcentral/answers/98867-how-do-i-create-a-c-shared-library-from-mex-files-using-the-matlab-compiler?requestedDomain=www.mathworks.com

Kind regards,
- Dmitry Mikushin.

2016-10-05 11:32 GMT+03:00 juraj2...@gmail.com:
> Hello,
>
> I have an application in C++ (main.cpp) that is launched with multiple
> processes via mpirun. The master process calls MATLAB via
> system('matlab -nosplash -nodisplay -nojvm -nodesktop -r "interface"'),
> which executes a simple script interface.m that calls a mexFunction
> (mexsolve.cpp), from which I try to set up communication with the rest
> of the processes launched at the beginning together with the master
> process. When I run the application as listed below on two different
> machines, I experience:
>
> 1) a crash at MPI_Init() in the mexFunction() on a cluster machine
> with Linux 4.4.0-22-generic
>
> 2) the error in MPI_Send() shown below on a local machine with
> Linux 3.10.0-229.el7.x86_64:
>
> [archimedes:31962] shmem: mmap: an error occurred while determining whether or not /tmp/openmpi-sessions-1007@archimedes_0/58444/1/shared_mem_pool.archimedes could be created.
> [archimedes:31962] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [archimedes:31962] shmem: mmap: an error occurred while determining whether or not /tmp/openmpi-sessions-1007@archimedes_0/58444/1/0/vader_segment.archimedes.0 could be created.
> [archimedes][[58444,1],0][../../../../../opal/mca/btl/tcp/btl_tcp_endpoint.c:800:mca_btl_tcp_endpoint_complete_connect] connect() to failed: Connection refused (111)
>
> I launch the application as follows:
> mpirun --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 1 -np 2 -npernode 1 ./main
>
> I have openmpi-2.0.1 configured with --prefix=${INSTALLDIR}
> --enable-mpi-fortran=all --with-pmi --disable-dlopen
>
> For more details, the code is here:
> https://github.com/goghino/matlabMpiC
>
> Thanks for any suggestions!
>
> Juraj
Re: [OMPI users] MPI_File_write hangs on NFS-mounted filesystem
Not sure if this is related, but: I've seen a case of performance
degradation on NFS and Lustre when writing NetCDF files. The reason was
that the file was filled by a loop writing one 4-byte record at a time.
Performance became close to that of a local hard drive when I simply
buffered the records and wrote them out one full row at a time.

- D.

2013/11/7 Steven G Johnson:
> The simple C program attached below hangs in MPI_File_write when I am
> using an NFS-mounted filesystem. Is MPI-IO supported in OpenMPI for
> NFS filesystems?
>
> I'm using OpenMPI 1.4.5 on Debian stable (wheezy), 64-bit Opteron CPU,
> Linux 3.2.51. I was surprised by this because the problems only
> started occurring recently, when I upgraded my Debian system to
> wheezy; with OpenMPI in the previous Debian release, output to
> NFS-mounted filesystems worked fine.
>
> Is there any easy way to get this working? Any tips are appreciated.
>
> Regards,
> Steven G. Johnson
>
> ---
> #include <stdio.h>
> #include <string.h>
> #include <mpi.h>
>
> void perr(const char *label, int err)
> {
>     char s[MPI_MAX_ERROR_STRING];
>     int len;
>     MPI_Error_string(err, s, &len);
>     printf("%s: %d = %s\n", label, err, s);
> }
>
> int main(int argc, char **argv)
> {
>     MPI_Init(&argc, &argv);
>
>     MPI_File fh;
>     int err;
>     err = MPI_File_open(MPI_COMM_WORLD, "tstmpiio.dat",
>                         MPI_MODE_CREATE | MPI_MODE_WRONLY,
>                         MPI_INFO_NULL, &fh);
>     perr("open", err);
>
>     const char s[] = "Hello world!\n";
>     MPI_Status status;
>     err = MPI_File_write(fh, (void*) s, strlen(s), MPI_CHAR, &status);
>     perr("write", err);
>
>     err = MPI_File_close(&fh);
>     perr("close", err);
>
>     MPI_Finalize();
>     return 0;
> }
Re: [OMPI users] Stream interactions in CUDA
Hi Justin,

Quick grepping reveals several cuMemcpy calls in OpenMPI, and some of
them are even synchronous, meaning stream 0. I think the best way to
explore this sort of behavior is to run the OpenMPI runtime (thanks to
its open-source nature!) under a debugger. Rebuild OpenMPI with -g -O0
and add an initial sleep() to your app, long enough to let you
gdb-attach to one of the MPI processes. Once attached, first set a
breakpoint at the beginning of your region of interest, and then break
on cuMemcpy and cuMemcpyAsync.

Best,
- D.

2012/12/13 Justin Luitjens:
> Hello,
>
> I'm working on an application using OpenMPI with CUDA and GPUDirect. I
> would like the MPI transfers to overlap with computation on the CUDA
> device. To do this I need to ensure that no memory transfers go to
> stream 0. In this application I have one step that performs an
> MPI_Alltoall operation. Ideally I would like this Alltoall operation
> to be asynchronous, so I have implemented my own Alltoall using Isend
> and Irecv, which can be found at the bottom of this email.
>
> The profiler shows that this operation has some very odd PCI-E traffic
> that I was hoping someone could explain and help me eliminate. In this
> example NPES=2 and each process has its own M2090 GPU. I am using
> CUDA 5.0 and OpenMPI-1.7rc5. The behavior I am seeing is the
> following: once the Isend loop occurs, there is a sequence of DtoH
> followed by HtoD transfers. These transfers are 256K in size and there
> are 28 of them. Each of these transfers is placed in stream 0. After
> this there are a few more small transfers, also placed in stream 0.
> Finally, when the 3rd loop occurs, there are 2 DtoD transfers (this is
> the actual data being exchanged).
>
> Can anyone explain what all of the traffic ping-ponging back and forth
> between the host and device is? Is this traffic necessary?
>
> Thanks,
> Justin
>
> uint64_t scatter_gather(uint128 *input_buffer, uint128 *output_buffer,
>                         uint128 *recv_buckets, int *send_sizes,
>                         int MAX_RECV_SIZE_PER_PE) {
>   std::vector<MPI_Request> srequest(NPES), rrequest(NPES);
>
>   // Start receives
>   for (int p = 0; p < NPES; p++) {
>     MPI_Irecv(recv_buckets + MAX_RECV_SIZE_PER_PE*p,
>               MAX_RECV_SIZE_PER_PE, MPI_INT_128, p, 0,
>               MPI_COMM_WORLD, &rrequest[p]);
>   }
>
>   // Start sends
>   int send_count = 0;
>   for (int p = 0; p < NPES; p++) {
>     MPI_Isend(input_buffer + send_count, send_sizes[p], MPI_INT_128,
>               p, 0, MPI_COMM_WORLD, &srequest[p]);
>     send_count += send_sizes[p];
>   }
>
>   // Process outstanding receives
>   int recv_count = 0;
>   for (int p = 0; p < NPES; p++) {
>     MPI_Status status;
>     MPI_Wait(&rrequest[p], &status);
>     int count;
>     MPI_Get_count(&status, MPI_INT_128, &count);
>     assert(count <= MAX_RECV_SIZE_PER_PE);
>     cudaMemcpy(output_buffer + recv_count,
>                recv_buckets + MAX_RECV_SIZE_PER_PE*p,
>                count*sizeof(uint128), cudaMemcpyDeviceToDevice);
>     recv_count += count;
>   }
>
>   // Wait for outstanding sends
>   for (int p = 0; p < NPES; p++) {
>     MPI_Status status;
>     MPI_Wait(&srequest[p], &status);
>   }
>   return recv_count;
> }
Re: [OMPI users] fork in Fortran
Hi,

Modern Fortran has a feature called ISO_C_BINDING. It essentially
allows you to declare a binding to an external C function and use it
from a Fortran program; you only need to provide the corresponding
interface. The ISO_C_BINDING module also contains C-like extensions to
the type system, but you don't need them here, as your function has no
arguments :) Example:

program fork_test
  interface
    function fork() bind(C)
      use iso_c_binding
      integer(c_int) :: fork
    end function fork
  end interface
  print *, 'My PID = ', fork()
end program fork_test

$ make
gfortran fork.f90 -o fork
$ ./fork
 My PID = 4033
 My PID = 0

For further info, please refer to the language standard:
http://gcc.gnu.org/wiki/GFortranStandards#Fortran_2003
If you have any questions, consider asking the gfortran community; they
are very friendly.

Best,
- D.

2012/8/30 sudhirs@:
> Dear users,
>
> How does one use fork(), vfork() type functions in Fortran
> programming?
>
> Thanking you in advance
>
> --
> Sudhir Kumar Sahoo
> Ph.D Scholar
> Dept. Of Chemistry
> IIT Kanpur-208016
Re: [OMPI users] bug in CUDA support for dual-processor systems?
Hi Zbigniew,

> a) I noticed that on my 6-GPU 2-CPU platform the initialization of
> CUDA 4.2 takes a long time, approx 10 seconds.
> Do you think I should report this as a bug to nVidia?

This is an expected time for the creation of driver contexts on so many
devices. I'm sure NVIDIA has already received thousands of reports on
this :) The typical answer is: keep a context alive on the GPU, either
by running an X server or by executing "nvidia-smi -l 1" in the
background. With one of these, init time should drop to ~1 second or
less.

- D.

2012/7/31 Zbigniew Koza:
> Thanks for a quick reply.
>
> I do not know much about low-level CUDA and IPC, but there's no
> problem using high-level CUDA to determine if device A can talk to B
> via GPUDirect (cudaDeviceCanAccessPeer). Then, for such connections,
> one only needs to call cudaDeviceEnablePeerAccess and then essentially
> "sit back and laugh": given the correct current device and stream,
> functions like cudaMemcpyPeer work irrespective of whether GPUDirect
> is on or off for a given pair of devices, the only difference being
> the speed. So, I hope it should be possible to implement
> device-IOH-IOH-device communication using low-level CUDA. Such
> functionality would be an important step in the "CPU-GPU
> high-performance war" :-), as 8-GPU fast-MPI-link systems bring a new
> meaning to a "GPU node" in GPU clusters...
>
> Here is the output of my test program, which was aimed at determining
> a) the aggregate, best-case transfer rate between 6 GPUs running in
> parallel, and b) whether devices on different IOHs can talk to each
> other:
>
> 3 [GB] in 78.6952 [ms] = 38.1218 GB/s (aggregate)
> sending 6 bytes from device 0:
> 0 -> 0: 11.3454 [ms] 52.8848 GB/s
> 0 -> 1: 90.3628 [ms] 6.6399 GB/s
> 0 -> 2: 113.396 [ms] 5.29117 GB/s
> 0 -> 3: 113.415 [ms] 5.29032 GB/s
> 0 -> 4: 170.307 [ms] 3.52305 GB/s
> 0 -> 5: 169.613 [ms] 3.53747 GB/s
>
> This shows that even if devices are on different IOHs, like 0 and 4,
> they can talk to each other at a fantastic speed of 3.5 GB/s, and it
> would be a pity if OpenMPI did not use this opportunity.
>
> I also have 2 questions:
>
> a) I noticed that on my 6-GPU 2-CPU platform the initialization of
> CUDA 4.2 takes a long time, approx 10 seconds. Do you think I should
> report this as a bug to nVidia?
>
> b) Is there any info on running OpenMPI + CUDA? For example, what are
> the dependencies of transfer rates and latencies on transfer size?
> A dedicated www page, blog or whatever? How can I know when the
> current problem is solved?
>
> Many thanks for making CUDA available in OpenMPI.
>
> Regards,
> Z Koza
>
> On 31.07.2012 19:39, Rolf vandeVaart wrote:
>> The current implementation does assume that the GPUs are on the same
>> IOH and therefore can use the IPC features of the CUDA library for
>> communication. One of the initial motivations for this was that, to
>> be able to detect whether GPUs can talk to one another, the CUDA
>> library has to be initialized and the GPUs have to be selected by
>> each rank; it is at that point that we can determine whether IPC will
>> work between the GPUs. However, this means that the GPUs need to be
>> selected by each rank prior to the call to MPI_Init, as that is where
>> we determine whether IPC is possible, and we were trying to avoid
>> that requirement.
>>
>> I will submit a ticket against this and see if we can improve this.
>>
>> Rolf
>>
>>> -----Original Message-----
>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
>>> On Behalf Of Zbigniew Koza
>>> Sent: Tuesday, July 31, 2012 12:38 PM
>>> To: us...@open-mpi.org
>>> Subject: [OMPI users] bug in CUDA support for dual-processor
>>> systems?
>>>
>>> Hi,
>>>
>>> I wrote a simple program to see if OpenMPI can really handle cuda
>>> pointers as promised in the FAQ, and how efficiently. The program
>>> (see below) breaks if MPI communication is to be performed between
>>> two devices that are on the same node but under different IOHs in a
>>> dual-processor Intel machine. Note that cudaMemcpy works for such
>>> devices, although not as efficiently as for devices on the same IOH
>>> with GPUDirect enabled.
>>>
>>> Here's the output from my program:
>>>
>>> ===
>>> mpirun -n 6 ./a.out
>>>
>>> Init
>>> Init
>>> Init
>>> Init
>>> Init
>>> Init
>>> rank: 1, size: 6
>>> rank: 2, size: 6
>>> rank: 3, size: 6
>>> rank: 4, size: 6
>>> rank: 5, size: 6
>>> rank: 0, size: 6
>>> device 3 is set
>>> Process 3 is on typhoon1
>>> Using regular memory
>>> device 0 is set
>>> Process 0 is on typhoon1
>>> Using regular memory
>>> device 4 is set
>>> Process 4 is on typhoon1
>>> Using regular memory
>>> device 1 is set
>>> Process 1 is on typhoon1
>>> Using regular memory
>>> device 5 is set
>>> Process 5 is on typhoon1
>>> Using regular memory
>>> device 2 is set
>>> Process 2 is on typhoon1
>>> Using regular memory
>>> ^C^[[A^C
>>> zkoza@typhoon1:~/multigpu$
Re: [OMPI users] undefined reference to `netcdf_mp_nf90_open_'
Dear Syed,

Why do you think it is related to MPI? You seem to be compiling the
COSMO model, which depends on the netcdf library, but for some reason
the symbols are not resolved by the linker. The two main reasons are:
(1) a library linking flag is missing (check that you have something
like -lnetcdf -lnetcdff on your linker command line); (2) the netcdf
Fortran bindings were compiled with a different naming convention
(check that the names in the library really contain the expected number
of trailing underscores). I compiled COSMO 4.22 with openmpi and netcdf
not long ago without any problems.

Best,
- Dima.

2012/6/26 Syed Ahsan Ali:
> Dear All,
>
> I am getting the following error while compiling an application. It
> seems like something related to netcdf and mpif90. Although I have
> compiled netcdf with the mpif90 option, I don't know why this error is
> happening. Any hint would be highly appreciated.
>
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o: In function `src_obs_proc_cdf_mp_obs_cdf_read_org_':
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x17aa): undefined reference to `netcdf_mp_nf90_open_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o: In function `src_obs_proc_cdf_mp_obs_cdf_read_temp_pilot_':
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x1000e): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10039): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10064): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x1008b): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x100c8): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10227): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x102eb): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x103af): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10473): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10559): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10890): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x108bb): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x108e2): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10909): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10930): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o:/home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x109e8): more undefined references to `netcdf_mp_nf90_inq_varid_' follow
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/obj/src_obs_proc_cdf.o: In function `src_obs_proc_cdf_mp_obs_cdf_read_temp_pilot_':
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10abc): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10b8c): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10c5c): undefined reference to `netcdf_mp_nf90_get_var_1d_eightbytereal_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10d2c): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10dfc): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10ecc): undefined reference to `netcdf_mp_nf90_get_var_1d_fourbyteint_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src/src_obs_proc_cdf.f90:(.text+0x10ef3): undefined reference to `netcdf_mp_nf90_inq_varid_'
> /home/pmdtest/cosmo/source/cosmo_110525_4.18/src
Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments
Dear Rolf,

I compiled openmpi-trunk with

$ ../configure --prefix=/opt/openmpi-trunk --disable-mpi-interface-warning --with-cuda=/opt/cuda

and that error is now gone! Thanks a lot for your assistance,

- D.

2012/6/19 Rolf vandeVaart:
> Dmitry:
>
> It turns out that, by default, configure in Open MPI 1.7 enables
> warnings for deprecated MPI functionality. In Open MPI 1.6, these
> warnings were disabled by default. That explains why you would not see
> this issue in the earlier versions of Open MPI.
>
> I assume that gcc must have added support for
> __attribute__((__deprecated__)) and then later on
> __attribute__((__deprecated__(msg))), and your version of gcc supports
> both of these. (My version of gcc, 4.5.1, does not support the msg in
> the attribute.) The version of nvcc you have does not support the
> "msg" argument, so everything blows up.
>
> I suggest you configure with --disable-mpi-interface-warning, which
> will prevent any of the deprecated attributes from being used, and
> then things should work fine.
>
> Let me know if this fixes your problem.
>
> Rolf
>
> From: Rolf vandeVaart
> Sent: Monday, June 18, 2012 11:00 AM
>
> Hi Dmitry:
>
> Let me look into this.
>
> Rolf
>
> [remainder of the earlier quoted exchange snipped; it duplicates the
> "Yeah, definitely. Thank you, Jeff." message archived below]
Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments
Yeah, definitely. Thank you, Jeff. - D. 2012/6/18 Jeff Squyres > On Jun 18, 2012, at 10:41 AM, Dmitry N. Mikushin wrote: > > > No, I'm configuring with gcc, and for openmpi-1.6 it works with nvcc > without a problem. > > Then I think Rolf (from Nvidia) should figure this out; I don't have > access to nvcc. :-) > > > Actually, nvcc always meant to be more or less compatible with gcc, as > far as I know. I'm guessing in case of trunk nvcc is the source of the > issue. > > > > And with ./configure CC=nvcc etc. it won't build: > > > /home/dmikushin/forge/openmpi-trunk/opal/mca/event/libevent2019/libevent/include/event2/util.h:126:2: > error: #error "No way to define ev_uint64_t" > > You should complain to Nvidia about that. > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments
No, I'm configuring with gcc, and for openmpi-1.6 it works with nvcc without a problem. Actually, nvcc always meant to be more or less compatible with gcc, as far as I know. I'm guessing in case of trunk nvcc is the source of the issue. And with ./configure CC=nvcc etc. it won't build: /home/dmikushin/forge/openmpi-trunk/opal/mca/event/libevent2019/libevent/include/event2/util.h:126:2: error: #error "No way to define ev_uint64_t" Thanks, - D. 2012/6/18 Jeff Squyres > Did you configure and build Open MPI with nvcc? > > I ask because Open MPI should auto-detect whether the underlying compiler > can handle a message argument with the deprecated directive or not. > > You should be able to build Open MPI with: > >./configure CC=nvcc etc. >make clean all install > > If you're building Open MPI with one compiler and then trying to compile > with another (like the command line in your mail implies), all bets are off > because Open MPI has tuned itself to the compiler that it was configured > with. > > > > > On Jun 18, 2012, at 10:20 AM, Dmitry N. Mikushin wrote: > > > Hello, > > > > With openmpi svn trunk as of > > > > Repository Root: http://svn.open-mpi.org/svn/ompi > > Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe > > Revision: 26616 > > > > we are observing the following strange issue (see below). How do you > think, is it a problem of NVCC or OpenMPI? > > > > Thanks, > > - Dima. 
> > > > [dmikushin@tesla-apc mpitest]$ cat mpitest.cu > > #include > > > > __global__ void kernel() { } > > > > [dmikushin@tesla-apc mpitest]$ nvcc -I/opt/openmpi-trunk/include -c > mpitest.cu > > /opt/openmpi-trunk/include/mpi.h(365): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(374): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(382): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(724): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(730): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(736): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(790): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(791): error: attribute "__deprecated__" > does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1049): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1070): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1072): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1074): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1145): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1149): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1151): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1345): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1347): error: attribute > "__deprecated__" does 
not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1484): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1507): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1510): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1515): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1525): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1527): error: attribute > "__deprecated__" does not take arguments > > > > /opt/openmpi-trunk/include/mpi.h(1589): error: attribute > "__deprecated__" does not take ar
[OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments
Hello, With openmpi svn trunk as of Repository Root: http://svn.open-mpi.org/svn/ompi Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe Revision: 26616 we are observing the following strange issue (see below). How do you think, is it a problem of NVCC or OpenMPI? Thanks, - Dima. [dmikushin@tesla-apc mpitest]$ cat mpitest.cu #include __global__ void kernel() { } [dmikushin@tesla-apc mpitest]$ nvcc -I/opt/openmpi-trunk/include -c mpitest.cu /opt/openmpi-trunk/include/mpi.h(365): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(374): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(382): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(724): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(730): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(736): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(790): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(791): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1049): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1070): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1072): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1074): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1145): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1149): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1151): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1345): error: attribute "__deprecated__" does not take 
arguments /opt/openmpi-trunk/include/mpi.h(1347): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1484): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1507): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1510): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1515): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1525): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1527): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1589): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1610): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1612): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1614): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1685): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1689): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1691): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1886): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(1888): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(2024): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(2047): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(2050): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(2055): error: attribute "__deprecated__" does not take arguments 
/opt/openmpi-trunk/include/mpi.h(2065): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/mpi.h(2067): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/openmpi/ompi/mpi/cxx/comm.h(102): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/openmpi/ompi/mpi/cxx/win.h(90): error: attribute "__deprecated__" does not take arguments /opt/openmpi-trunk/include/openmpi/ompi/mpi/cxx/file.h(298): error: attribute "__deprecated__" does not take arguments 41 errors detected in the compilation of "/tmp/tmpxft_4a17_-4_mpitest.cpp1.ii".
Re: [OMPI users] starting open-mpi
Hi Ghobad, The error message means that Open MPI wants to use cl.exe - the compiler from Microsoft Visual Studio. Here http://www.open-mpi.org/software/ompi/v1.5/ms-windows.php it is stated: This is the first binary release for Windows, with basic MPI libraries and executables. The supported platforms are Windows XP, Windows Vista, Windows Server 2003/2008, and Windows 7 (including both 32 and 64 bit versions). The installers were configured with CMake 2.8.1 and compiled under Visual Studio 2010, and they support the C/C++ compilers of Visual Studio 2005, 2008 and 2010. So, to compile MPI programs you probably need one of these compilers to be installed. Best regards. - Dima. 2012/5/10 Ghobad Zarrinchian : > Hi all. I'm a new open-mpi user. I've downloaded the > OpenMPI_v1.5.5-1_win32.exe file to install open-mpi on my dual-core windows > 7 machine. I installed the file but now I can't compile my MPI programs. I > use the command below (in a command prompt window) to compile my 'test.cpp' > program: > >>> mpic++ -o test test.cpp > > but I get the following error: > >>> The open mpi wrapper compiler was unable to find the specified compiler >>> cl.exe in your path. > Note that this compiler was either specified at configure time or in > one of several possible environment variables. > > What is the problem? Is my compilation command right? Are there any remaining > necessary steps to complete my open-mpi installation? > Is it necessary to specify some environment variables? > > Thanks in advance.
Re: [OMPI users] possibly undefined macro: AC_PROG_LIBTOOL
OK, apparently that were various backports. And now I have only one autoconf installed, and it is: marcusmae@teslatron:~/Programming/openmpi-r24785$ autoconf --version autoconf (GNU Autoconf) 2.67 marcusmae@teslatron:~/Programming/openmpi-r24785$ autoreconf --version autoreconf (GNU Autoconf) 2.67 However: 6. Processing autogen.subdirs directories === Processing subdir: /home/marcusmae/Programming/openmpi-r24785/opal/mca/event/libevent207/libevent --- Found autogen.sh; running... autoreconf: Entering directory `.' autoreconf: configure.in: not using Gettext autoreconf: running: aclocal --force -I m4 autoreconf: configure.in: tracing autoreconf: configure.in: not using Libtool autoreconf: running: /usr/bin/autoconf --force configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL If this token and others are legitimate, please use m4_pattern_allow. See the Autoconf documentation. autoreconf: /usr/bin/autoconf failed with exit status: 1 Command failed: ./autogen.sh Does it work for you with 2.67? Thanks, - D. 2011/12/30 Ralph Castain : > > On Dec 29, 2011, at 3:39 PM, Dmitry N. Mikushin wrote: > >> No, that was autoREconf, and all they are below 2.65: >> >> marcusmae@teslatron:~/Programming/openmpi-r24785$ ls /usr/bin/autoreconf >> autoreconf autoreconf2.13 autoreconf2.50 autoreconf2.59 >> autoreconf2.64 >> >> And default one points to 2.50: >> >> marcusmae@teslatron:~/Programming/openmpi-r24785$ autoreconf -help >> Usage: /usr/bin/autoreconf2.50 [OPTION]... [DIRECTORY]... >> >> I don't know why, probably that's the default Debian Squeeze setup? > > Probably - but that's no good. It should be the same level as autoconf as the > two are packaged together to avoid incompatibilities like you are hitting > here. Did you install autoconf yourself? If so, can you point autoreconf to > the corresponding binary? > >> >> - D. 
>> >> 2011/12/30 Ralph Castain : >>> Strange - if you look at your original output, autoconf is identified as >>> 2.50 - a version that is way too old for us. However, what you just sent >>> now shows 2.67, which would be fine. >>> >>> Why the difference? >>> >>> >>> On Dec 29, 2011, at 3:27 PM, Dmitry N. Mikushin wrote: >>> >>>> Hi Ralph, >>>> >>>> URL: http://svn.open-mpi.org/svn/ompi/trunk >>>> Repository Root: http://svn.open-mpi.org/svn/ompi >>>> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe >>>> Revision: 24785 >>>> Node Kind: directory >>>> Schedule: normal >>>> Last Changed Author: rhc >>>> Last Changed Rev: 24785 >>>> Last Changed Date: 2011-06-17 22:01:23 +0400 (Fri, 17 Jun 2011) >>>> >>>> 1. Checking tool versions >>>> >>>> Searching for autoconf >>>> Found autoconf version 2.67; checking version... >>>> Found version component 2 -- need 2 >>>> Found version component 67 -- need 65 >>>> ==> ACCEPTED >>>> Searching for libtoolize >>>> Found libtoolize version 2.2.6b; checking version... >>>> Found version component 2 -- need 2 >>>> Found version component 2 -- need 2 >>>> Found version component 6b -- need 6b >>>> ==> ACCEPTED >>>> Searching for automake >>>> Found automake version 1.11.1; checking version... >>>> Found version component 1 -- need 1 >>>> Found version component 11 -- need 11 >>>> Found version component 1 -- need 1 >>>> ==> ACCEPTED >>>> >>>> 2011/12/30 Ralph Castain : >>>>> Are you doing this on a subversion checkout? Of which branch? >>>>> >>>>> Did you check your autotoll versions to ensure you meet the minimum >>>>> required levels? The requirements differ by version. >>>>> >>>>> On Dec 29, 2011, at 2:52 PM, Dmitry N. Mikushin wrote: >>>>> >>>>>> Dear Open MPI Community, >>>>>> >>>>>> I need a custom OpenMPI build. While running ./autogen.pl on Debian >>>>>> Squeeze, there is an error: >>>>>> >>>>>> --- Found autogen.sh; running... >>>>>> autoreconf2.50: Entering directory `.' 
>>>>>> autoreconf2.50: configure.in: not using Gettext >>>>>> autoreconf2.50: running: aclocal --force -I m4 >>>>>> autoreconf2.50: configure.in: tracing &g
Re: [OMPI users] possibly undefined macro: AC_PROG_LIBTOOL
No, that was autoREconf, and they are all below 2.65: marcusmae@teslatron:~/Programming/openmpi-r24785$ ls /usr/bin/autoreconf autoreconf autoreconf2.13 autoreconf2.50 autoreconf2.59 autoreconf2.64 And the default one points to 2.50: marcusmae@teslatron:~/Programming/openmpi-r24785$ autoreconf -help Usage: /usr/bin/autoreconf2.50 [OPTION]... [DIRECTORY]... I don't know why, probably that's the default Debian Squeeze setup? - D. 2011/12/30 Ralph Castain : > Strange - if you look at your original output, autoconf is identified as 2.50 > - a version that is way too old for us. However, what you just sent now shows > 2.67, which would be fine. > > Why the difference? > > > On Dec 29, 2011, at 3:27 PM, Dmitry N. Mikushin wrote: > >> Hi Ralph, >> >> URL: http://svn.open-mpi.org/svn/ompi/trunk >> Repository Root: http://svn.open-mpi.org/svn/ompi >> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe >> Revision: 24785 >> Node Kind: directory >> Schedule: normal >> Last Changed Author: rhc >> Last Changed Rev: 24785 >> Last Changed Date: 2011-06-17 22:01:23 +0400 (Fri, 17 Jun 2011) >> >> 1. Checking tool versions >> >> Searching for autoconf >> Found autoconf version 2.67; checking version... >> Found version component 2 -- need 2 >> Found version component 67 -- need 65 >> ==> ACCEPTED >> Searching for libtoolize >> Found libtoolize version 2.2.6b; checking version... >> Found version component 2 -- need 2 >> Found version component 2 -- need 2 >> Found version component 6b -- need 6b >> ==> ACCEPTED >> Searching for automake >> Found automake version 1.11.1; checking version... >> Found version component 1 -- need 1 >> Found version component 11 -- need 11 >> Found version component 1 -- need 1 >> ==> ACCEPTED >> >> 2011/12/30 Ralph Castain : >>> Are you doing this on a subversion checkout? Of which branch? >>> >>> Did you check your autotool versions to ensure you meet the minimum >>> required levels? The requirements differ by version.
>>> >>> On Dec 29, 2011, at 2:52 PM, Dmitry N. Mikushin wrote: >>> >>>> Dear Open MPI Community, >>>> >>>> I need a custom OpenMPI build. While running ./autogen.pl on Debian >>>> Squeeze, there is an error: >>>> >>>> --- Found autogen.sh; running... >>>> autoreconf2.50: Entering directory `.' >>>> autoreconf2.50: configure.in: not using Gettext >>>> autoreconf2.50: running: aclocal --force -I m4 >>>> autoreconf2.50: configure.in: tracing >>>> autoreconf2.50: configure.in: not using Libtool >>>> autoreconf2.50: running: /usr/bin/autoconf --force >>>> configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL >>>> If this token and others are legitimate, please use m4_pattern_allow. >>>> See the Autoconf documentation. >>>> autoreconf2.50: /usr/bin/autoconf failed with exit status: 1 >>>> Command failed: ./autogen.sh >>>> >>>> It's a bit confusing, because automake, libtool, autoconf are >>>> installed. What might be the other reasons of this error? >>>> >>>> Thanks, >>>> - Dima. >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> >>> >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] possibly undefined macro: AC_PROG_LIBTOOL
Hi Ralph, URL: http://svn.open-mpi.org/svn/ompi/trunk Repository Root: http://svn.open-mpi.org/svn/ompi Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe Revision: 24785 Node Kind: directory Schedule: normal Last Changed Author: rhc Last Changed Rev: 24785 Last Changed Date: 2011-06-17 22:01:23 +0400 (Fri, 17 Jun 2011) 1. Checking tool versions Searching for autoconf Found autoconf version 2.67; checking version... Found version component 2 -- need 2 Found version component 67 -- need 65 ==> ACCEPTED Searching for libtoolize Found libtoolize version 2.2.6b; checking version... Found version component 2 -- need 2 Found version component 2 -- need 2 Found version component 6b -- need 6b ==> ACCEPTED Searching for automake Found automake version 1.11.1; checking version... Found version component 1 -- need 1 Found version component 11 -- need 11 Found version component 1 -- need 1 ==> ACCEPTED 2011/12/30 Ralph Castain : > Are you doing this on a subversion checkout? Of which branch? > > Did you check your autotoll versions to ensure you meet the minimum required > levels? The requirements differ by version. > > On Dec 29, 2011, at 2:52 PM, Dmitry N. Mikushin wrote: > >> Dear Open MPI Community, >> >> I need a custom OpenMPI build. While running ./autogen.pl on Debian >> Squeeze, there is an error: >> >> --- Found autogen.sh; running... >> autoreconf2.50: Entering directory `.' >> autoreconf2.50: configure.in: not using Gettext >> autoreconf2.50: running: aclocal --force -I m4 >> autoreconf2.50: configure.in: tracing >> autoreconf2.50: configure.in: not using Libtool >> autoreconf2.50: running: /usr/bin/autoconf --force >> configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL >> If this token and others are legitimate, please use m4_pattern_allow. >> See the Autoconf documentation. 
>> autoreconf2.50: /usr/bin/autoconf failed with exit status: 1 >> Command failed: ./autogen.sh >> >> It's a bit confusing, because automake, libtool, autoconf are >> installed. What might be the other reasons of this error? >> >> Thanks, >> - Dima. >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
[OMPI users] possibly undefined macro: AC_PROG_LIBTOOL
Dear Open MPI Community, I need a custom OpenMPI build. When running ./autogen.pl on Debian Squeeze, I get an error: --- Found autogen.sh; running... autoreconf2.50: Entering directory `.' autoreconf2.50: configure.in: not using Gettext autoreconf2.50: running: aclocal --force -I m4 autoreconf2.50: configure.in: tracing autoreconf2.50: configure.in: not using Libtool autoreconf2.50: running: /usr/bin/autoconf --force configure.in:113: error: possibly undefined macro: AC_PROG_LIBTOOL If this token and others are legitimate, please use m4_pattern_allow. See the Autoconf documentation. autoreconf2.50: /usr/bin/autoconf failed with exit status: 1 Command failed: ./autogen.sh It's a bit confusing, because automake, libtool, and autoconf are all installed. What other reasons could there be for this error? Thanks, - Dima.
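For reference, AC_PROG_LIBTOOL is not an Autoconf built-in: it is supplied by libtool's libtool.m4, so "possibly undefined macro" usually means aclocal could not see that file. A minimal configure.ac sketch (the project name and file names are placeholders, not anything from this thread) that uses the macro and tells aclocal where to look for it:

```m4
AC_INIT([myproject], [1.0])   # placeholder project name/version
AM_INIT_AUTOMAKE([foreign])
AC_CONFIG_MACRO_DIR([m4])     # lets aclocal pick up libtool.m4 copied here
AC_PROG_CC
AC_PROG_LIBTOOL               # provided by libtool; newer releases spell it LT_INIT
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
```

If aclocal still cannot locate libtool.m4 (e.g. because libtool and autoconf were installed into different prefixes), running libtoolize --copy --force before autoreconf drops the needed macro files into the tree.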
Re: [OMPI users] How "CUDA Init prior to MPI_Init" co-exists with unique GPU for each MPI process?
Dear Matthieu, Rolf, Thank you! But normally CUDA device selection is based on the MPI process index, so the CUDA context would have to exist at a point where the MPI index is not yet available. What is the best practice for process<->GPU mapping in this case? Or can I select any device prior to MPI_Init and later change to another device? - D. 2011/12/14 Rolf vandeVaart : > To add to this, yes, we recommend that the CUDA context exists prior to a > call to MPI_Init. That is because a CUDA context needs to exist prior to > MPI_Init, as the library attempts to register some internal buffers with the > CUDA library that require that a CUDA context already exists. Note that this is > only relevant if you plan to send and receive CUDA device memory directly > from MPI calls. There is a little more about this in the FAQ here. > > > > http://www.open-mpi.org/faq/?category=running#mpi-cuda-support > > > > > > Rolf > > > > From: Matthieu Brucher [mailto:matthieu.bruc...@gmail.com] > Sent: Wednesday, December 14, 2011 10:47 AM > To: Open MPI Users > Cc: Rolf vandeVaart > Subject: Re: [OMPI users] How "CUDA Init prior to MPI_Init" co-exists with > unique GPU for each MPI process? > > > > Hi, > > > > Processes are not spawned by MPI_Init. They are spawned beforehand by some > application, between your mpirun call and when your program starts. When it > does, you already have all MPI processes (you can check by adding a sleep or > something like that), but they are not synchronized and do not know each > other. This is what MPI_Init is used for. > > > > Matthieu Brucher > > 2011/12/14 Dmitry N. Mikushin > > Dear colleagues, > > For GPU Winter School powered by Moscow State University cluster > "Lomonosov", the OpenMPI 1.7 was built to test and popularize CUDA > capabilities of MPI. There is one strange warning I cannot understand: > OpenMPI runtime suggests to initialize CUDA prior to MPI_Init. Sorry, > but how could it be?
I thought processes are spawned during MPI_Init, > and such context will be created on the very first root process. Why > do we need existing CUDA context before MPI_Init? I think there was no > such error in previous versions. > > Thanks, > - D. > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > > -- > Information System Engineer, Ph.D. > Blog: http://matt.eifelle.com > LinkedIn: http://www.linkedin.com/in/matthieubrucher > > > This email message is for the sole use of the intended recipient(s) and may > contain confidential information. Any unauthorized review, use, disclosure > or distribution is prohibited. If you are not the intended recipient, > please contact the sender by reply email and destroy all copies of the > original message. > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
[OMPI users] How "CUDA Init prior to MPI_Init" co-exists with unique GPU for each MPI process?
Dear colleagues, For the GPU Winter School powered by the Moscow State University cluster "Lomonosov", OpenMPI 1.7 was built to test and popularize the CUDA capabilities of MPI. There is one strange warning I cannot understand: the OpenMPI runtime suggests initializing CUDA prior to MPI_Init. Sorry, but how can that be? I thought processes were spawned during MPI_Init, and such a context would be created only on the very first root process. Why do we need an existing CUDA context before MPI_Init? I think there was no such error in previous versions. Thanks, - D.
Re: [OMPI users] configure with cuda
> CUDA is an Nvidia-only technology, so it might be a bit limiting in some > cases. I think here it's more a question of compatibility (that is, ~ 1.0 / [magnitude of effort]) rather than corporate selfishness >:) Consider the memory buffer implementation - unlike in CUDA, in OpenCL buffers are abstract containers (cl_mem), not plain pointers. So, to combine OpenCL with MPI, one first needs to propose and adopt a suitable API design. That alone is not an easy task, IMO. - D. 2011/10/27 Durga Choudhury : > Is there any provision/future plans to add OpenCL support as well? > CUDA is an Nvidia-only technology, so it might be a bit limiting in > some cases. > > Best regards > Durga > > > On Thu, Oct 27, 2011 at 2:45 PM, Rolf vandeVaart > wrote: >> Actually, that is not quite right. From the FAQ: >> >> >> >> “This feature currently only exists in the trunk version of the Open MPI >> library.” >> >> >> >> You need to download and use the trunk version for this to work. >> >> >> >> http://www.open-mpi.org/nightly/trunk/ >> >> >> >> Rolf >> >> >> >> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On >> Behalf Of Ralph Castain >> Sent: Thursday, October 27, 2011 11:43 AM >> To: Open MPI Users >> Subject: Re: [OMPI users] configure with cuda >> >> >> >> >> >> I'm pretty sure cuda support was never moved to the 1.4 series. You will, >> however, find it in the 1.5 series. I suggest you get the latest tarball >> from there. >> >> >> >> >> >> On Oct 27, 2011, at 12:38 PM, Peter Wells wrote: >> >> >> >> I am attempting to configure OpenMPI 1.4.3 with cuda support on a Redhat 5 >> box.
When I try to run configure with the following command: >> >> >> >> ./configure >> --prefix=/opt/crc/sandbox/pwells2/openmpi/1.4.3/intel-12.0-cuda/ FC=ifort >> F77=ifort CXX=icpc CC=icc --with-sge --disable-dlopen --enable-static >> --enable-shared --disable-openib-connectx-xrc --disable-openib-rdmacm >> --without-openib --with-cuda=/opt/crc/cuda/4.0/cuda >> --with-cuda-libdir=/opt/crc/cuda/4.0/cuda/lib64 >> >> >> >> I receive the warning that '--with-cuda' and '--with-cuda-libdir' are >> unrecognized options. According to the FAQ these options are supported in >> this version of OpenMPI. I attempted the same thing with v.1.4.4 downloaded >> directly from open-mpi.org with similar results. Attached are the results of >> configure and make on v.1.4.3. Any help would be greatly appreciated. >> >> >> >> Peter Wells >> HPC Intern >> Center for Research Computing >> University of Notre Dame >> pwel...@nd.edu >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> >> >> >> This email message is for the sole use of the intended recipient(s) and may >> contain confidential information. Any unauthorized review, use, disclosure >> or distribution is prohibited. If you are not the intended recipient, >> please contact the sender by reply email and destroy all copies of the >> original message. >> >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users >> > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
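Putting Rolf's and Ralph's answers together: the --with-cuda and --with-cuda-libdir options are only recognized by the trunk's configure, not by 1.4.x. A sketch of a trunk build (the prefix and CUDA paths below are placeholders, not taken from the thread):

```shell
# Configure an Open MPI trunk tarball with CUDA support; adjust paths to
# your installation -- these are illustrative values only.
./configure --prefix=/opt/openmpi-trunk \
    --with-cuda=/usr/local/cuda \
    --with-cuda-libdir=/usr/local/cuda/lib64
make all install
```

On a 1.4.x tarball the same flags are silently reported as unrecognized, which is exactly the warning Peter saw.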
Re: [OMPI users] OpenMPI with CPU of different speed.
Hi, Maybe Mickaël means that load balancing could be achieved simply by spawning a varying number of MPI processes, depending on how many cores a particular node has? This should be possible, but the accuracy of such balancing will be task-dependent due to other factors, like memory operations and communications. - D. 2011/10/5 Andreas Schäfer : > I'm afraid you'll have to do this kind of load balancing in your > application itself, as Open MPI (just like any other MPI implementation) > has no notion of how your application manages its workload. > > HTH > -Andreas > > > On 14:05 Wed 05 Oct , Mickaël CANÉVET wrote: >> Hi, >> >> Is there a way to define a weight for the CPUs of the hosts? I have a >> cluster made of machines from different generations, and when I run a >> process on it, the whole cluster is slowed down by the slowest node. >> >> What I'd like to do is something like that in my hostfile: >> >> oldest slots=4 weight=0.75 >> newer slots=8 weight=0.95 >> newest slots=12 weight=1 >> >> So that the CPUs of the oldest (and slowest) machine get less data to process. >> >> Thank you >> Mickaël > > -- > == > Andreas Schäfer > HPC and Grid Computing > Chair of Computer Science 3 > Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany > +49 9131 85-27910 > PGP/GPG key via keyserver > http://www.libgeodecomp.org > == > > (\___/) > (+'.'+) > (")_(") > This is Bunny. Copy and paste Bunny into your > signature to help him gain world domination!
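A sketch of that idea (hostnames and slot counts are made up): since Open MPI's hostfile has no weight keyword, the closest standard mechanism is to vary the slots count per host, so that slower machines simply receive fewer processes:

```
# hostfile: fewer slots on slower machines (illustrative numbers only;
# Open MPI supports "slots=N" but not "weight=...")
oldest  slots=3     # 4 cores, but leave headroom on the slowest box
newer   slots=7
newest  slots=12
```

Running mpirun --hostfile hostfile -np 22 ./app would then distribute processes in roughly that proportion; as noted above, any finer-grained balancing has to happen inside the application itself.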
Re: [OMPI users] [SOLVED] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Ok, here's the solution: remove the --as-needed option from the compiler's internal linker invocation command line. Steps to do this: 1) Dump the compiler specs: $ gcc -dumpspecs > specs 2) Open the specs file for editing and remove --as-needed from the line *link: %{!r:--build-id} --no-add-needed --as-needed %{!static:--eh-frame-hdr} %{!m32:-m elf_x86_64} %{m32:-m elf_i386} --hash-style=gnu %{shared:-shared} %{!shared: %{!static: %{rdynamic:-export-dynamic} %{m32:-dynamic-linker %{muclibc:/lib/ld-uClibc.so.0;:%{mbionic:/system/bin/linker;:/lib/ld-linux.so.2}}} %{!m32:-dynamic-linker %{muclibc:/lib/ld64-uClibc.so.0;:%{mbionic:/system/bin/linker64;:/lib64/ld-linux-x86-64.so.2}}}} %{static:-static}} resulting in *link: %{!r:--build-id} --no-add-needed %{!static:--eh-frame-hdr} %{!m32:-m elf_x86_64} %{m32:-m elf_i386} --hash-style=gnu %{shared:-shared} %{!shared: %{!static: %{rdynamic:-export-dynamic} %{m32:-dynamic-linker %{muclibc:/lib/ld-uClibc.so.0;:%{mbionic:/system/bin/linker;:/lib/ld-linux.so.2}}} %{!m32:-dynamic-linker %{muclibc:/lib/ld64-uClibc.so.0;:%{mbionic:/system/bin/linker64;:/lib64/ld-linux-x86-64.so.2}}}} %{static:-static}} 3) Save the specs file into the compiler's folder /usr/lib/gcc/<target>/<version>/ For example, in the case of Ubuntu 11.10 with gcc 4.6.1 it's /usr/lib/gcc/x86_64-linux-gnu/4.6.1/ With this change, there are no unresolvable relocations anymore! - D. 2011/10/3 Dmitry N. 
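Steps 1 and 2 above can be scripted. A minimal sketch (GNU sed assumed; the sample line below is a shortened stand-in for the real *link: rule, which is much longer):

```shell
# Reproduce steps 1-2 on a stand-in specs file.
# On a real system, the first command would be:  gcc -dumpspecs > specs
printf '%s\n' '*link:' \
  '%{!r:--build-id} --no-add-needed --as-needed %{!static:--eh-frame-hdr}' \
  > specs
# Step 2: remove the flag everywhere it occurs in the dumped specs.
sed -i 's/ --as-needed//g' specs
# The *link: rule no longer contains --as-needed.
cat specs
```

The edited file then goes into the compiler's own directory (step 3), e.g. /usr/lib/gcc/x86_64-linux-gnu/4.6.1/specs, where gcc picks it up in place of its built-in specs.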
Mikushin : > Hi, > > Here's a reprocase, the same one as mentioned here: > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=608901 > > marcusmae@loveland:~/Programming/mpitest$ cat mpitest.f90 > program main > include 'mpif.h' > integer ierr > call mpi_init(ierr) > end > > marcusmae@loveland:~/Programming/mpitest$ mpif90 -g mpitest.f90 > /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x542): unresolvable > R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_' > /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x55c): unresolvable > R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_' > /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5d2): unresolvable > R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_' > /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5ec): unresolvable > R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_' > > Remove "-g", and the error will be gone. > > marcusmae@loveland:~/Programming/mpitest$ mpif90 --showme -g mpitest.f90 > gfortran -g mpitest.f90 -I/opt/openmpi_gcc-1.5.4/include -pthread > -I/opt/openmpi_gcc-1.5.4/lib -L/opt/openmpi_gcc-1.5.4/lib -lmpi_f90 > -lmpi_f77 -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl > > marcusmae@loveland:~/Programming/mpitest$ mpif90 -v > Using built-in specs. 
> COLLECT_GCC=/usr/bin/gfortran > COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper > Target: x86_64-linux-gnu > Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro > 4.6.1-9ubuntu3' > --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs > --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr > --program-suffix=-4.6 --enable-shared --enable-linker-build-id > --with-system-zlib --libexecdir=/usr/lib --without-included-gettext > --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 > --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu > --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin > --enable-objc-gc --disable-werror --with-arch-32=i686 > --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu > --host=x86_64-linux-gnu --target=x86_64-linux-gnu > Thread model: posix > gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) > > 2011/9/28 Dmitry N. Mikushin : >> Hi, >> >> Interestingly, the errors are gone after I removed "-g" from the app >> compile options. >> >> I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4 >> compile fine, but with the same error. >> Also I tried hard to find any 32-bit object or library and failed. >> They all are 64-bit. >> >> - D. >> >> 2011/9/24 Jeff Squyres : >>> Check the output from when you ran Open MPI's configure and "make all" -- >>> did it decide to build the F77 interface? >>> >>> Also check that gcc and gfortran output .o files of the same bitness / type. >>> >>> >>> On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote: >>> >>>> Compile and link - yes, but it turns out there was some unnoticed >>>> compilation error because >>>> >>>> ./hellompi: error while loading shared libraries: libmpi_f77.so.1: >>>> cannot open shared object file: No such
Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Hi, Here's a repro case, the same one as mentioned here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=608901 marcusmae@loveland:~/Programming/mpitest$ cat mpitest.f90 program main include 'mpif.h' integer ierr call mpi_init(ierr) end marcusmae@loveland:~/Programming/mpitest$ mpif90 -g mpitest.f90 /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x542): unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_' /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x55c): unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_' /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5d2): unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_' /usr/bin/ld: /tmp/cc3NLduM.o(.debug_info+0x5ec): unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_errcodes_ignore_' Remove "-g", and the error will be gone. marcusmae@loveland:~/Programming/mpitest$ mpif90 --showme -g mpitest.f90 gfortran -g mpitest.f90 -I/opt/openmpi_gcc-1.5.4/include -pthread -I/opt/openmpi_gcc-1.5.4/lib -L/opt/openmpi_gcc-1.5.4/lib -lmpi_f90 -lmpi_f77 -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl marcusmae@loveland:~/Programming/mpitest$ mpif90 -v Using built-in specs. 
COLLECT_GCC=/usr/bin/gfortran COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) 2011/9/28 Dmitry N. Mikushin : > Hi, > > Interestingly, the errors are gone after I removed "-g" from the app > compile options. > > I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4 > compile fine, but with the same error. > Also I tried hard to find any 32-bit object or library and failed. > They all are 64-bit. > > - D. > > 2011/9/24 Jeff Squyres : >> Check the output from when you ran Open MPI's configure and "make all" -- >> did it decide to build the F77 interface? >> >> Also check that gcc and gfortran output .o files of the same bitness / type. >> >> >> On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote: >> >>> Compile and link - yes, but it turns out there was some unnoticed >>> compilation error because >>> >>> ./hellompi: error while loading shared libraries: libmpi_f77.so.1: >>> cannot open shared object file: No such file or directory >>> >>> and this library does not exist. >>> >>> Hm. >>> >>> 2011/9/24 Jeff Squyres : >>>> Can you compile / link simple OMPI applications without this problem? 
>>>> >>>> On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote: >>>> >>>>> Hi Jeff, >>>>> >>>>> Today I've verified this application on the Feroda 15 x86_64, where >>>>> I'm usually building OpenMPI from source using the same method. >>>>> Result: no link errors there! So, the issue is likely ubuntu-specific. >>>>> >>>>> Target application is compiled linked with mpif90 pointing to >>>>> /opt/openmpi_gcc-1.5.4/bin/mpif90 I built. >>>>> >>>>> Regarding architectures, everything in target folders and OpenMPI >>>>> installation is >>>>> ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically >>>>> linked, not stripped >>>>> >>>>> - D. >>>>> >>>>> 2011/9/24 Jeff Squyres : >>>>>> How does the target application compile / link itself? >>>>>> >>>>>> Try running "file" on the Open MPI libraries and/or your target >>>>>> application .o files to see what their bitness is, etc. >>>>>> >>>>>> >>>>>> On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote: >>>>>> >>>>>>> Hi Jeff, >>>>>>> >>>>>>> You're right because I
Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Hi, Interestingly, the errors are gone after I removed "-g" from the app compile options. I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4 compile fine, but with the same error. Also I tried hard to find any 32-bit object or library and failed. They all are 64-bit. - D. 2011/9/24 Jeff Squyres : > Check the output from when you ran Open MPI's configure and "make all" -- did > it decide to build the F77 interface? > > Also check that gcc and gfortran output .o files of the same bitness / type. > > > On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote: > >> Compile and link - yes, but it turns out there was some unnoticed >> compilation error because >> >> ./hellompi: error while loading shared libraries: libmpi_f77.so.1: >> cannot open shared object file: No such file or directory >> >> and this library does not exist. >> >> Hm. >> >> 2011/9/24 Jeff Squyres : >>> Can you compile / link simple OMPI applications without this problem? >>> >>> On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote: >>> >>>> Hi Jeff, >>>> >>>> Today I've verified this application on the Feroda 15 x86_64, where >>>> I'm usually building OpenMPI from source using the same method. >>>> Result: no link errors there! So, the issue is likely ubuntu-specific. >>>> >>>> Target application is compiled linked with mpif90 pointing to >>>> /opt/openmpi_gcc-1.5.4/bin/mpif90 I built. >>>> >>>> Regarding architectures, everything in target folders and OpenMPI >>>> installation is >>>> ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically >>>> linked, not stripped >>>> >>>> - D. >>>> >>>> 2011/9/24 Jeff Squyres : >>>>> How does the target application compile / link itself? >>>>> >>>>> Try running "file" on the Open MPI libraries and/or your target >>>>> application .o files to see what their bitness is, etc. >>>>> >>>>> >>>>> On Sep 22, 2011, at 3:15 PM, Dmitry N. 
Mikushin wrote: >>>>> >>>>>> Hi Jeff, >>>>>> >>>>>> You're right because I also tried 1.4.3, and it's the same issue >>>>>> there. But what could be wrong? I'm using the simplest form - >>>>>> ../configure --prefix=/opt/openmpi_gcc-1.4.3/ and only installed >>>>>> compilers are system-default gcc and gfortran 4.6.1. Distro is ubuntu >>>>>> 11.10. There is no any mpi installed from packages, and no -m32 >>>>>> options around. What else could be the source? >>>>>> >>>>>> Thanks, >>>>>> - D. >>>>>> >>>>>> 2011/9/22 Jeff Squyres : >>>>>>> This usually means that you're mixing compiler/linker flags somehow >>>>>>> (e.g., built something with 32 bit, built something else with 64 bit, >>>>>>> try to link them together). >>>>>>> >>>>>>> Can you verify that everything was built with all the same 32/64? >>>>>>> >>>>>>> >>>>>>> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives >>>>>>>> a load of linker messages like this one: >>>>>>>> >>>>>>>> /usr/bin/ld: >>>>>>>> ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d): >>>>>>>> unresolvable R_X86_64_64 relocation against symbol >>>>>>>> `mpi_fortran_argv_null_ >>>>>>>> >>>>>>>> There are a lot of similar messages about other mpi_fortran_ symbols. >>>>>>>> Is it a known issue? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> - D. >>>>>>>> ___ >>>>>>>> users mailing list >>>>>>>> us...@open-mpi.org >>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Jeff Sq
Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Compile and link - yes, but it turns out there was some unnoticed compilation error because ./hellompi: error while loading shared libraries: libmpi_f77.so.1: cannot open shared object file: No such file or directory and this library does not exist. Hm. 2011/9/24 Jeff Squyres : > Can you compile / link simple OMPI applications without this problem? > > On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote: > >> Hi Jeff, >> >> Today I've verified this application on the Feroda 15 x86_64, where >> I'm usually building OpenMPI from source using the same method. >> Result: no link errors there! So, the issue is likely ubuntu-specific. >> >> Target application is compiled linked with mpif90 pointing to >> /opt/openmpi_gcc-1.5.4/bin/mpif90 I built. >> >> Regarding architectures, everything in target folders and OpenMPI >> installation is >> ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically >> linked, not stripped >> >> - D. >> >> 2011/9/24 Jeff Squyres : >>> How does the target application compile / link itself? >>> >>> Try running "file" on the Open MPI libraries and/or your target application >>> .o files to see what their bitness is, etc. >>> >>> >>> On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote: >>> >>>> Hi Jeff, >>>> >>>> You're right because I also tried 1.4.3, and it's the same issue >>>> there. But what could be wrong? I'm using the simplest form - >>>> ../configure --prefix=/opt/openmpi_gcc-1.4.3/ and only installed >>>> compilers are system-default gcc and gfortran 4.6.1. Distro is ubuntu >>>> 11.10. There is no any mpi installed from packages, and no -m32 >>>> options around. What else could be the source? >>>> >>>> Thanks, >>>> - D. >>>> >>>> 2011/9/22 Jeff Squyres : >>>>> This usually means that you're mixing compiler/linker flags somehow >>>>> (e.g., built something with 32 bit, built something else with 64 bit, try >>>>> to link them together). >>>>> >>>>> Can you verify that everything was built with all the same 32/64? 
>>>>> >>>>> >>>>> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives >>>>>> a load of linker messages like this one: >>>>>> >>>>>> /usr/bin/ld: >>>>>> ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d): >>>>>> unresolvable R_X86_64_64 relocation against symbol >>>>>> `mpi_fortran_argv_null_ >>>>>> >>>>>> There are a lot of similar messages about other mpi_fortran_ symbols. >>>>>> Is it a known issue? >>>>>> >>>>>> Thanks, >>>>>> - D. >>>>>> ___ >>>>>> users mailing list >>>>>> us...@open-mpi.org >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>>>> >>>>> >>>>> -- >>>>> Jeff Squyres >>>>> jsquy...@cisco.com >>>>> For corporate legal information go to: >>>>> http://www.cisco.com/web/about/doing_business/legal/cri/ >>>>> >>>>> >>>>> ___ >>>>> users mailing list >>>>> us...@open-mpi.org >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>>>> >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> >>> >>> -- >>> Jeff Squyres >>> jsquy...@cisco.com >>> For corporate legal information go to: >>> http://www.cisco.com/web/about/doing_business/legal/cri/ >>> >>> >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Hi Jeff, Today I've verified this application on Fedora 15 x86_64, where I'm usually building OpenMPI from source using the same method. Result: no link errors there! So, the issue is likely Ubuntu-specific. The target application is compiled and linked with the mpif90 pointing to /opt/openmpi_gcc-1.5.4/bin/mpif90 that I built. Regarding architectures, everything in the target folders and the OpenMPI installation is ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped - D. 2011/9/24 Jeff Squyres : > How does the target application compile / link itself? > > Try running "file" on the Open MPI libraries and/or your target application > .o files to see what their bitness is, etc. > > > On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote: > >> Hi Jeff, >> >> You're right because I also tried 1.4.3, and it's the same issue >> there. But what could be wrong? I'm using the simplest form - >> ../configure --prefix=/opt/openmpi_gcc-1.4.3/ and only installed >> compilers are system-default gcc and gfortran 4.6.1. Distro is ubuntu >> 11.10. There is no any mpi installed from packages, and no -m32 >> options around. What else could be the source? >> >> Thanks, >> - D. >> >> 2011/9/22 Jeff Squyres : >>> This usually means that you're mixing compiler/linker flags somehow (e.g., >>> built something with 32 bit, built something else with 64 bit, try to link >>> them together). >>> >>> Can you verify that everything was built with all the same 32/64? >>> >>> >>> On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote: >>> >>>> Hi, >>>> >>>> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives >>>> a load of linker messages like this one: >>>> >>>> /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d): >>>> unresolvable R_X86_64_64 relocation against symbol >>>> `mpi_fortran_argv_null_ >>>> >>>> There are a lot of similar messages about other mpi_fortran_ symbols. >>>> Is it a known issue? >>>> >>>> Thanks, >>>> - D. 
>>>> ___ >>>> users mailing list >>>> us...@open-mpi.org >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> >>> >>> -- >>> Jeff Squyres >>> jsquy...@cisco.com >>> For corporate legal information go to: >>> http://www.cisco.com/web/about/doing_business/legal/cri/ >>> >>> >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Hi Jeff, You're right because I also tried 1.4.3, and it's the same issue there. But what could be wrong? I'm using the simplest form - ../configure --prefix=/opt/openmpi_gcc-1.4.3/ - and the only installed compilers are the system-default gcc and gfortran 4.6.1. The distro is Ubuntu 11.10. There is no MPI installed from packages, and no -m32 options anywhere. What else could be the source? Thanks, - D. 2011/9/22 Jeff Squyres : > This usually means that you're mixing compiler/linker flags somehow (e.g., > built something with 32 bit, built something else with 64 bit, try to link > them together). > > Can you verify that everything was built with all the same 32/64? > > > On Sep 22, 2011, at 1:21 PM, Dmitry N. Mikushin wrote: > >> Hi, >> >> OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives >> a load of linker messages like this one: >> >> /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d): >> unresolvable R_X86_64_64 relocation against symbol >> `mpi_fortran_argv_null_ >> >> There are a lot of similar messages about other mpi_fortran_ symbols. >> Is it a known issue? >> >> Thanks, >> - D. >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
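Jeff's "run file on everything" check can be automated over a whole install tree. A hedged sketch (the prefix is the one from this thread, and the usual GNU `file` output format is assumed):

```shell
# List any ELF objects under PREFIX that are NOT 64-bit; an empty result
# means no stray 32-bit files were mixed in.  PREFIX is illustrative.
PREFIX=${PREFIX:-/opt/openmpi_gcc-1.4.3}
find "$PREFIX" -type f \( -name '*.so*' -o -name '*.o' -o -name '*.a' \) \
     -exec file {} + 2>/dev/null | grep 'ELF' | grep -v 'ELF 64-bit' || true
```

Anything printed is an object whose bitness differs from the rest of the build and is a candidate cause of the relocation errors.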
Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Same error when configured with --with-pic --with-gnu-ld 2011/9/22 Dmitry N. Mikushin : > Hi, > > OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives > a load of linker messages like this one: > > /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d): > unresolvable R_X86_64_64 relocation against symbol > `mpi_fortran_argv_null_ > > There are a lot of similar messages about other mpi_fortran_ symbols. > Is it a known issue? > > Thanks, > - D. >
[OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*
Hi, OpenMPI 1.5.4 compiled with gcc 4.6.1 and linked with target app gives a load of linker messages like this one: /usr/bin/ld: ../../lib/libutil.a(parallel_utilities.o)(.debug_info+0x529d): unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_argv_null_ There are a lot of similar messages about other mpi_fortran_ symbols. Is it a known issue? Thanks, - D.
Re: [OMPI users] Compiling both 32-bit and 64-bit?
Thanks, Brian, I'm trying to follow the guide for 1.5.4, not yet clear what's wrong: [marcusmae@zacate build32]$ ../configure --prefix=/opt/openmpi_kgen-1.5.4 --includedir=/opt/openmpi_kgen-1.5.4/include/32 --libdir=/opt/openmpi_kgen-1.5.4/lib32 --build=x86_64-unknown-linux --host=x86_64-unknown-linux --target=i686-unknown-linux --disable-binaries ... configure: WARNING: *** The Open MPI configure script does not support --program-prefix, --program-suffix or --program-transform-name. Users are recommended to instead use --prefix with a unique directory and make symbolic links as desired for renaming. configure: error: *** Cannot continue [marcusmae@zacate build32]$ ../configure --prefix=/opt/openmpi_kgen-1.5.4 --includedir=/opt/openmpi_kgen-1.5.4/include/32 --libdir=/opt/openmpi_kgen-1.5.4/lib32 --build=x86_64-unknown-linux --host=i686-unknown-linux --disable-binaries ... checking gfortran external symbol convention... link: invalid option -- 'd' Try `link --help' for more information. link: invalid option -- 'd' Try `link --help' for more information. link: invalid option -- 'd' Try `link --help' for more information. link: invalid option -- 'd' Try `link --help' for more information. link: invalid option -- 'd' Try `link --help' for more information. configure: error: unknown naming convention: 2011/8/24 Barrett, Brian W : > On 8/24/11 11:29 AM, "Dmitry N. Mikushin" wrote: > >>Quick question: is there an easy switch to compile and install both >>32-bit and 64-bit OpenMPI libraries into a single tree? E.g. 64-bit in >>/prefix/lib64 and 32-bit in /prefix/lib. > > Quick answer: not easily. > > Long answer: There's not an easy way, but there are some facilities to > help. I believe Oracle uses them when building binaries for Solaris. 
> There is some documentation available on our Trac wiki: > > https://svn.open-mpi.org/trac/ompi/wiki/MultiLib > https://svn.open-mpi.org/trac/ompi/wiki/compilerwrapper3264 > > The difficulty is that it's up to the user/admin to make sure the correct > arguments are provided, as well as writing the wrapper script files to do > the sharing. > > Brian > > -- > Brian W. Barrett > Dept. 1423: Scalable System Software > Sandia National Laboratories > > > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
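Piecing together the wiki pages Brian references, a two-pass build along these lines might work (entirely illustrative: the flags, paths, and the -m32 approach are assumptions from those pages, not a verified recipe):

```shell
# Pass 1: native 64-bit build into lib64 (sketch; not verified).
mkdir build64 && cd build64
../configure --prefix=/opt/openmpi-1.5.4 --libdir=/opt/openmpi-1.5.4/lib64
make all install
cd ..
# Pass 2: 32-bit build of the libraries only, into lib under the same prefix.
mkdir build32 && cd build32
../configure --prefix=/opt/openmpi-1.5.4 --libdir=/opt/openmpi-1.5.4/lib \
    CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 --disable-binaries
make all install
```

The wrapper-compiler plumbing described on the compilerwrapper3264 page still has to be set up by hand afterwards, as Brian notes.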
[OMPI users] Compiling both 32-bit and 64-bit?
Hi, Quick question: is there an easy switch to compile and install both 32-bit and 64-bit OpenMPI libraries into a single tree? E.g. 64-bit in /prefix/lib64 and 32-bit in /prefix/lib. Thanks, - D.
Re: [OMPI users] OpenMPI causing WRF to crash
BasitAli, Signal 15 apparently means that one of WRF's MPI processes has been unexpectedly terminated, maybe by a program decision. Whether it is OpenMPI-specific or not, the issue needs to be tracked somehow to get more details about it. Ideally, the best thing is to get a debugger attached once the process is signaled; then you can see the call trace and figure out what exactly happened. This can be done by registering a custom signal handler (see the Unix documentation on signals) or by running the MPI processes inside an external diagnostic tool, for example valgrind: mpirun -np <nprocs> valgrind --db-attach=yes ./appname ... or by consulting the WRF community to check whether they have already configured some other approach. Good luck with resolving this case! - D. 2011/8/3 BasitAli Khan : > I am trying to run a rather heavy wrf simulation with spectral nudging but > the simulation crashes after 1.8 minutes of integration. > The simulation has two domains with d01 = 601x601 and d02 = 721x721 and > 51 vertical levels. I tried this simulation on two different systems but > result was more or less same. For example > On our Bluegene/P with SUSE Linux Enterprise Server 10 ppc and XLF > compiler I tried to run wrf on 2048 shared memory nodes (1 compute node = 4 > cores , 32 bit, 850 Mhz). For the parallel run I used mpixlc, mpixlcxx and > mpixlf90. I got the following error message in the wrf.err file > BE_MPI (ERROR): The error message in the job > record is as follows: > BE_MPI (ERROR): "killed with signal 15" > I also tried to run the same simulation on our linux cluster (Linux Red Hat > Enterprise 5.4m x86_64 and Intel compiler) with 8, 16 and 64 nodes (1 > compute node=8 cores). For the parallel run I am > used mpi/openmpi/1.4.2-intel-11. I got the following error message in the > error log after couple of minutes of integration. > "mpirun has exited due to process rank 45 with PID 19540 on > node ci118 exiting without calling "finalize". 
This may > have caused other processes in the application to be > terminated by signals sent by mpirun (as reported here)." > I tried many things but nothing seems to be working. However, if I reduce > grid points below 200, the simulation goes fine. It appears that probably > OpenMP has problem with large number of grid points but I have no idea how > to fix it. I will greatly appreciate if you could suggest some solution. > Best regards, > --- > Basit A. Khan, Ph.D. > Postdoctoral Fellow > Division of Physical Sciences & Engineering > Office# 3204, Level 3, Building 1, > King Abdullah University of Science & Technology > 4700 King Abdullah Blvd, Box 2753, Thuwal 23955 –6900, > Kingdom of Saudi Arabia. > Office: +966(0)2 808 0276, Mobile: +966(0)5 9538 7592 > E-mail: basitali.k...@kaust.edu.sa > Skype name: basit.a.khan > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] Error installing OpenMPI 1.5.3
Sorry, disregard this, the issue was created by my own buggy compiler wrapper. - D. 2011/7/10 Dmitry N. Mikushin : > Hi, > > Maybe it would be useful to report the openmpi 1.5.3 archive currently > has a strange issue when installing on Fedora 15 x86_64 (gcc 4.6), > that *does not* happen with 1.4.3: > > $ ../configure --prefix=/opt/openmpi_kgen-1.5.3 CC=gcc CXX=g++ > F77=gfortran FC=gfortran > > ... > > $ sudo make install > > ... > > make[5]: Entering directory > `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' > test -z "/opt/openmpi_kgen-1.5.3/lib" || /bin/mkdir -p > "/opt/openmpi_kgen-1.5.3/lib" > /bin/sh ../../../libtool --mode=install /usr/bin/install -c > libmpi_f90.la '/opt/openmpi_kgen-1.5.3/lib' > libtool: install: warning: relinking `libmpi_f90.la' > libtool: install: (cd > /home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90; /bin/sh > /home/marcusmae/Programming/openmpi-1.5.3/build/libtool --silent > --tag FC --mode=relink /usr/bin/gfortran -I../../../ompi/include > -I../../../../ompi/include -I. 
-I../../../../ompi/mpi/f90 > -I../../../ompi/mpi/f90 -version-info 1:1:0 -export-dynamic -o > libmpi_f90.la -rpath /opt/openmpi_kgen-1.5.3/lib mpi.lo mpi_sizeof.lo > mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.lo > mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo > mpi_wtime_f90.lo ../../../ompi/mpi/f77/libmpi_f77.la -lnsl -lutil -lm > ) > mv: cannot stat `libmpi_f90.so.1.0.1': No such file or directory > libtool: install: error: relink `libmpi_f90.la' with the above command > before installing it > make[5]: *** [install-libLTLIBRARIES] Error 1 > make[5]: Leaving directory > `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' > make[4]: *** [install-am] Error 2 > make[4]: Leaving directory > `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' > make[3]: *** [install-recursive] Error 1 > make[3]: Leaving directory > `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' > make[2]: *** [install] Error 2 > make[2]: Leaving directory > `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' > make[1]: *** [install-recursive] Error 1 > make[1]: Leaving directory > `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi' > make: *** [install-recursive] Error 1 > > Is it a known problem? > > Thanks, > - D. >
[OMPI users] Error installing OpenMPI 1.5.3
Hi, Maybe it would be useful to report the openmpi 1.5.3 archive currently has a strange issue when installing on Fedora 15 x86_64 (gcc 4.6), that *does not* happen with 1.4.3: $ ../configure --prefix=/opt/openmpi_kgen-1.5.3 CC=gcc CXX=g++ F77=gfortran FC=gfortran ... $ sudo make install ... make[5]: Entering directory `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' test -z "/opt/openmpi_kgen-1.5.3/lib" || /bin/mkdir -p "/opt/openmpi_kgen-1.5.3/lib" /bin/sh ../../../libtool --mode=install /usr/bin/install -c libmpi_f90.la '/opt/openmpi_kgen-1.5.3/lib' libtool: install: warning: relinking `libmpi_f90.la' libtool: install: (cd /home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90; /bin/sh /home/marcusmae/Programming/openmpi-1.5.3/build/libtool --silent --tag FC --mode=relink /usr/bin/gfortran -I../../../ompi/include -I../../../../ompi/include -I. -I../../../../ompi/mpi/f90 -I../../../ompi/mpi/f90 -version-info 1:1:0 -export-dynamic -o libmpi_f90.la -rpath /opt/openmpi_kgen-1.5.3/lib mpi.lo mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.lo mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo mpi_wtime_f90.lo ../../../ompi/mpi/f77/libmpi_f77.la -lnsl -lutil -lm ) mv: cannot stat `libmpi_f90.so.1.0.1': No such file or directory libtool: install: error: relink `libmpi_f90.la' with the above command before installing it make[5]: *** [install-libLTLIBRARIES] Error 1 make[5]: Leaving directory `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' make[4]: *** [install-am] Error 2 make[4]: Leaving directory `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' make[3]: *** [install-recursive] Error 1 make[3]: Leaving directory `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' make[2]: *** [install] Error 2 make[2]: Leaving directory `/home/marcusmae/Programming/openmpi-1.5.3/build/ompi/mpi/f90' make[1]: *** [install-recursive] Error 1 make[1]: Leaving directory 
`/home/marcusmae/Programming/openmpi-1.5.3/build/ompi' make: *** [install-recursive] Error 1 Is it a known problem? Thanks, - D.
Re: [OMPI users] mpif90 compiler non-functional
Alexandre, > How can I make to point to the new installed version in > /opt/openmpi-1.4.3, when calling mpif90 or mpif77 as a common user ? If you need to switch between multiple working MPI implementations frequently (a common problem on public clusters or during local testing/benchmarking), scripts like mpi-selector can be very handy. First you register all the variants with mpi-selector --register <name>, and then you can switch the current one with mpi-selector --set <name> (and restart the shell). Technically, it does the same thing already mentioned - adding records to $PATH and LD_LIBRARY_PATH. The script is part of Red Hat distros (and was written by Jeff, I suppose), but you can easily rebuild its source RPM for your system or convert it with alien if you are on Ubuntu (works for me). - D. 2011/6/22 Alexandre Souza : > Thanks Dimitri and Jeff for the output, > I managed build the mpi and run the examples in f77 and f90 doing the > guideline. > However the only problem is I was logged as Root. > When I compile the examples with mpif90 or mpif77 as common user, it > keeps pointing to the old installation of mpi that does not use the > fortran compiler. > (/home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1) > How can I make to point to the new installed version in > /opt/openmpi-1.4.3, when calling mpif90 or mpif77 as a common user ? > Alex > > On Wed, Jun 22, 2011 at 1:49 PM, Jeff Squyres wrote: >> Dimitry is correct -- if OMPI's configure can find a working C++ and Fortran >> compiler, it'll build C++ / Fortran support. Yours was not, indicating that: >> >> a) you got a binary distribution from someone who didn't include C++ / >> Fortran support, or >> >> b) when you built/installed Open MPI, it couldn't find a working C++ / >> Fortran compiler, so it skipped building support for them. >> >> >> >> On Jun 22, 2011, at 12:05 PM, Dmitry N. 
Mikushin wrote: >> >>> Here's mine produced from default compilation: >>> >>> Package: Open MPI marcusmae@T61p Distribution >>> Open MPI: 1.4.4rc2 >>> Open MPI SVN revision: r24683 >>> Open MPI release date: May 05, 2011 >>> Open RTE: 1.4.4rc2 >>> Open RTE SVN revision: r24683 >>> Open RTE release date: May 05, 2011 >>> OPAL: 1.4.4rc2 >>> OPAL SVN revision: r24683 >>> OPAL release date: May 05, 2011 >>> Ident string: 1.4.4rc2 >>> Prefix: /opt/openmpi_gcc-1.4.4 >>> Configured architecture: x86_64-unknown-linux-gnu >>> Configure host: T61p >>> Configured by: marcusmae >>> Configured on: Tue May 24 18:39:21 MSD 2011 >>> Configure host: T61p >>> Built by: marcusmae >>> Built on: Tue May 24 18:46:52 MSD 2011 >>> Built host: T61p >>> C bindings: yes >>> C++ bindings: yes >>> Fortran77 bindings: yes (all) >>> Fortran90 bindings: yes >>> Fortran90 bindings size: small >>> C compiler: gcc >>> C compiler absolute: /usr/bin/gcc >>> C++ compiler: g++ >>> C++ compiler absolute: /usr/bin/g++ >>> Fortran77 compiler: gfortran >>> Fortran77 compiler abs: /usr/bin/gfortran >>> Fortran90 compiler: gfortran >>> Fortran90 compiler abs: /usr/bin/gfortran >>> >>> gfortran version is: >>> >>> gcc version 4.6.0 20110530 (Red Hat 4.6.0-9) (GCC) >>> >>> How do you run ./configure? Maybe try "./configure >>> FC=/usr/bin/gfortran" ? It should really really work out of box >>> though. Configure scripts usually cook some simple test apps and run >>> them to check if compiler works properly. So, your ./configure output >>> may help to understand more. >>> >>> - D. >>> >>> 2011/6/22 Alexandre Souza : >>>> Hi Dimitri, >>>> Thanks for the reply. >>>> I have openmpi installed before for another application in : >>>> /home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1 >>>> I installed a new version in /opt/openmpi-1.4.3. 
>>>> I reproduce some output from the screen : >>>> amscosta@amscosta-desktop:/opt/openmpi-1.4.3/bin$ ompi_info >>>> Package: Open MPI amscosta@amscosta-desktop Distribution >>>> Open MPI: 1.4.1 >>>> Open MPI SVN revision: r22421 >>>>
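The manual equivalent of what mpi-selector automates can be sketched as follows; the /opt/openmpi-1.4.3 prefix is the example path from this thread, so substitute your own installation:

```shell
# Prepend one MPI installation's bin/ and lib/ to the lookup paths so its
# wrapper compilers and shared libraries win over older installs.
# The prefix below is the example path from this thread.
MPI_PREFIX=/opt/openmpi-1.4.3
export PATH="$MPI_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_PREFIX/lib:${LD_LIBRARY_PATH:-}"
# The wrapper found first in PATH should now come from $MPI_PREFIX:
command -v mpif90 || echo "mpif90 not installed under $MPI_PREFIX"
```

Put the two export lines in ~/.bashrc (or a small per-implementation script you source) to make the choice persistent for a common user.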
Re: [OMPI users] mpif90 compiler non-functional
Here's mine, produced from a default compilation: Package: Open MPI marcusmae@T61p Distribution Open MPI: 1.4.4rc2 Open MPI SVN revision: r24683 Open MPI release date: May 05, 2011 Open RTE: 1.4.4rc2 Open RTE SVN revision: r24683 Open RTE release date: May 05, 2011 OPAL: 1.4.4rc2 OPAL SVN revision: r24683 OPAL release date: May 05, 2011 Ident string: 1.4.4rc2 Prefix: /opt/openmpi_gcc-1.4.4 Configured architecture: x86_64-unknown-linux-gnu Configure host: T61p Configured by: marcusmae Configured on: Tue May 24 18:39:21 MSD 2011 Configure host: T61p Built by: marcusmae Built on: Tue May 24 18:46:52 MSD 2011 Built host: T61p C bindings: yes C++ bindings: yes Fortran77 bindings: yes (all) Fortran90 bindings: yes Fortran90 bindings size: small C compiler: gcc C compiler absolute: /usr/bin/gcc C++ compiler: g++ C++ compiler absolute: /usr/bin/g++ Fortran77 compiler: gfortran Fortran77 compiler abs: /usr/bin/gfortran Fortran90 compiler: gfortran Fortran90 compiler abs: /usr/bin/gfortran gfortran version is: gcc version 4.6.0 20110530 (Red Hat 4.6.0-9) (GCC) How do you run ./configure? Maybe try "./configure FC=/usr/bin/gfortran"? It should really work out of the box, though. Configure scripts usually build some simple test programs and run them to check that the compiler works properly. So your ./configure output may help us understand more. - D. 2011/6/22 Alexandre Souza : > Hi Dimitri, > Thanks for the reply. > I have openmpi installed before for another application in : > /home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1 > I installed a new version in /opt/openmpi-1.4.3. 
> I reproduce some output from the screen : > amscosta@amscosta-desktop:/opt/openmpi-1.4.3/bin$ ompi_info > Package: Open MPI amscosta@amscosta-desktop Distribution > Open MPI: 1.4.1 > Open MPI SVN revision: r22421 > Open MPI release date: Jan 14, 2010 > Open RTE: 1.4.1 > Open RTE SVN revision: r22421 > Open RTE release date: Jan 14, 2010 > OPAL: 1.4.1 > OPAL SVN revision: r22421 > OPAL release date: Jan 14, 2010 > Ident string: 1.4.1 > Prefix: > /home/amscosta/OpenFOAM/ThirdParty-1.7.x/platforms/linuxGcc/openmpi-1.4.1 > Configured architecture: i686-pc-linux-gnu > Configure host: amscosta-desktop > Configured by: amscosta > Configured on: Wed May 18 11:10:14 BRT 2011 > Configure host: amscosta-desktop > Built by: amscosta > Built on: Wed May 18 11:16:21 BRT 2011 > Built host: amscosta-desktop > C bindings: yes > C++ bindings: no > Fortran77 bindings: no > Fortran90 bindings: no > Fortran90 bindings size: na > C compiler: gcc > C compiler absolute: /usr/bin/gcc > C++ compiler: g++ > C++ compiler absolute: /usr/bin/g++ > Fortran77 compiler: gfortran > Fortran77 compiler abs: /usr/bin/gfortran > Fortran90 compiler: none > Fortran90 compiler abs: none > C profiling: no > C++ profiling: no > Fortran77 profiling: no > Fortran90 profiling: no > C++ exceptions: no > Thread support: posix (mpi: no, progress: no) > Sparse Groups: no > Internal debug support: no > MPI parameter check: runtime > Memory profiling support: no > Memory debugging support: no > libltdl support: yes > Heterogeneous support: no > mpirun default --prefix: no > MPI I/O support: yes > MPI_WTIME support: gettimeofday > Symbol visibility support: yes > .. > > > On Wed, Jun 22, 2011 at 12:34 PM, Dmitry N. Mikushin > wrote: >> Alexandre, >> >> Did you have a working Fortran compiler in system in time of OpenMPI >> compilation? To my experience Fortran bindings are always compiled by >> default. How did you configured it and have you noticed any messages >> reg. Fortran support in configure output? 
>> >> - D. >> >> 2011/6/22 Alexandre Souza : >>> Dear Group, >>> After compiling the openmpi source, the following message is displayed >>> when trying to compile >>> the hello program in fortran : >>> amscosta@amscosta-desktop:~/openmpi-1.4.3/examples$ >>> /opt/openmpi-1.4.3/bin/mpif90 -g hello_f90.f90 -o hello_f90 >>> --
Re: [OMPI users] mpif90 compiler non-functional
Alexandre, Did you have a working Fortran compiler on the system at the time of the Open MPI compilation? In my experience, Fortran bindings are always compiled by default. How did you configure it, and did you notice any messages regarding Fortran support in the configure output? - D. 2011/6/22 Alexandre Souza : > Dear Group, > After compiling the openmpi source, the following message is displayed > when trying to compile > the hello program in fortran : > amscosta@amscosta-desktop:~/openmpi-1.4.3/examples$ > /opt/openmpi-1.4.3/bin/mpif90 -g hello_f90.f90 -o hello_f90 > -- > Unfortunately, this installation of Open MPI was not compiled with > Fortran 90 support. As such, the mpif90 compiler is non-functional. > -- > Any clue how to solve it is very welcome. > Thanks, > Alex > P.S. I am using a ubuntu box with gfortran > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
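As a quick pre-flight check before running ./configure, one can verify that a Fortran compiler is actually visible in PATH; a minimal sketch (gfortran is simply the compiler discussed in this thread):

```shell
# If configure cannot find a working Fortran compiler, it silently builds
# Open MPI without mpif90/mpif77 support, leading to the error above.
# Check up front whether gfortran is reachable.
if command -v gfortran >/dev/null 2>&1; then
    echo "gfortran found: $(command -v gfortran)"
    GFORTRAN_OK=yes
else
    echo "no gfortran in PATH; install it or pass FC=/path/to/compiler to ./configure"
    GFORTRAN_OK=no
fi
```

If the check fails, install gfortran first and re-run configure; the resulting ompi_info output should then report "Fortran90 bindings: yes".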
Re: [OMPI users] USE mpi
Oh, clear now, thank you! 2011/5/8 Steph Bredenhann > Jeff is correct. The Intel environmental variables are either set in > /etc/profile or /user/.bashrc (or manually). Root sets its own environmental > variables and therefore the key is to make sure that the environmental > variables are set before an installation as root is done, i.e.: > > source /opt/intel/Compiler/11.1/073/bin/ifortvars.sh intel64 > source /opt/intel/Compiler/11.1/073/bin/iccvars.sh intel64 > > Then the rest of the procedure can follow. > > It sounds simple and it is, perhaps > > -- > Steph Bredenhann > > On Sun, 2011-05-08 at 09:09 -0400, Jeff Squyres (jsquyres) wrote: > > Make all gets the same environment as make install (assuming you do it in the > same shell). But if you sudo make install, the environment may be different - > it may not inherit everything from your environment. > > I advised the user to "sudo -s" and ten setup the compiler environment and > then run make install. > > Sent from my phone. No type good. > > On May 7, 2011, at 9:37 PM, "Dmitry N. Mikushin" wrote: > > > Tim, > > > > I certainly do not expect anything special, just normally "make > > install" should not have issues, if "make" passes fine, right? What we > > have with OpenMPI is this strange difference: if ./configure CC=icc, > > "make" works, and "make install" - does not; if ./configure > > CC=/full/path/to/icc, then both "make" and "make install" work. > > Nothing needs to be searched, icc is already in PATH, since > > compilevars are sourced in profile.d. Or am I missing something? > > > > Thanks, > > - D. > > > > 2011/5/8 Tim Prince : > >> On 5/7/2011 2:35 PM, Dmitry N. Mikushin wrote: > >>>> > >>>> didn't find the icc compiler > >>> > >>> Jeff, on 1.4.3 I saw the same issue, even more generally: "make > >>> install" cannot find the compiler, if it is an alien compiler (i.e. > >>> not the default gcc) - same situation for intel or llvm, for example. 
> >>> The workaround is to specify full paths to compilers with CC=... > >>> FC=... in ./configure params. Could it be "make install" breaks some > >>> env paths? > >>> > >> > >> Most likely reason for not finding an installed icc is that the icc > >> environment (source the compilervars script if you have a current version) > >> wasn't set prior to running configure. Setting up the compiler in question > >> in accordance with its own instructions is a more likely solution than the > >> absolute path choice. > >> OpenMPI configure, for good reason, doesn't search your system to see where > >> a compiler might be installed. What if you had 2 versions of the same > >> named > >> compiler? > >> -- > >> Tim Prince > >> ___ > >> users mailing list > >> us...@open-mpi.org > >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >> > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing > listusers@open-mpi.orghttp://www.open-mpi.org/mailman/listinfo.cgi/users > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
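Steph's recipe above can be sketched as one guarded command; the two script paths are exactly the ones quoted in the thread and will differ for other Intel compiler versions:

```shell
# Run make install in a shell where the Intel environment scripts have
# already been sourced, so the root environment sees icc/ifort.
# Paths are the ones from this thread; adjust to your installation.
ICC_VARS=/opt/intel/Compiler/11.1/073/bin/iccvars.sh
IFORT_VARS=/opt/intel/Compiler/11.1/073/bin/ifortvars.sh
if [ -r "$ICC_VARS" ] && [ -r "$IFORT_VARS" ]; then
    sudo bash -c ". '$ICC_VARS' intel64 && . '$IFORT_VARS' intel64 && make install"
else
    echo "Intel environment scripts not found at the thread's paths; edit ICC_VARS/IFORT_VARS"
fi
```

Sourcing inside the sudo'd shell is the point: a plain "sudo make install" does not inherit the compiler environment set up in the user's shell.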
Re: [OMPI users] USE mpi
Tim, I certainly do not expect anything special; normally "make install" should not have issues if "make" passes fine, right? What we see with Open MPI is this strange difference: with ./configure CC=icc, "make" works but "make install" does not; with ./configure CC=/full/path/to/icc, both "make" and "make install" work. Nothing needs to be searched for: icc is already in PATH, since the compilervars scripts are sourced from profile.d. Or am I missing something? Thanks, - D. 2011/5/8 Tim Prince : > On 5/7/2011 2:35 PM, Dmitry N. Mikushin wrote: >>> >>> didn't find the icc compiler >> >> Jeff, on 1.4.3 I saw the same issue, even more generally: "make >> install" cannot find the compiler, if it is an alien compiler (i.e. >> not the default gcc) - same situation for intel or llvm, for example. >> The workaround is to specify full paths to compilers with CC=... >> FC=... in ./configure params. Could it be "make install" breaks some >> env paths? >> > > Most likely reason for not finding an installed icc is that the icc > environment (source the compilervars script if you have a current version) > wasn't set prior to running configure. Setting up the compiler in question > in accordance with its own instructions is a more likely solution than the > absolute path choice. > OpenMPI configure, for good reason, doesn't search your system to see where > a compiler might be installed. What if you had 2 versions of the same named > compiler? > -- > Tim Prince > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] USE mpi
> didn't find the icc compiler Jeff, on 1.4.3 I saw the same issue, even more generally: "make install" cannot find the compiler if it is a non-default one (i.e. not the stock gcc); the same happens with Intel or LLVM, for example. The workaround is to specify full paths to the compilers with CC=... FC=... in the ./configure parameters. Could it be that "make install" breaks some env paths? - D. 2011/5/8 Jeff Squyres : > We iterated off-list -- the problem was that "sudo make install" didn't find > the icc compiler, and therefore didn't complete properly. > > It seems that the ompi_info and mpif90 cited in this thread were from some > other (broken?) OMPI installation. > > > > On May 7, 2011, at 3:01 PM, Steph Bredenhann wrote: > >> Sorry, I missed the 2nd statement: >> >> Fortran90 bindings: yes >> Fortran90 bindings size: small >> Fortran90 compiler: gfortran >> Fortran90 compiler abs: /usr/bin/gfortran >> Fortran90 profiling: yes >> >> >> -- >> Steph Bredenhann >> >> On Sat, 2011-05-07 at 14:46 -0400, Jeff Squyres wrote: >>> ompi_info | grep 90 >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] Help: HPL Problem
Eric, You have a link-time error complaining about the absence of some libraries. At least two of them, libm and libdl, must be provided by the system, not by the MPI implementation. Could you locate them in /usr/lib64? It would also be useful to figure out whether the problem is global or specific to HPL: do you get any errors compiling a simple "hello world" program with Open MPI? - D. 2011/5/7 Lee Eric : > Hi, > > I encountered following error messages when I compiled HPL. > > make[2]: Entering directory > `/pool/measure/hpl-2.0/testing/ptest/Linux_PII_FBLAS' > /pool/MPI/openmpi/bin/mpif90 -DAdd__ -DF77_INTEGER=int > -DStringSunStyle -I/pool/measure/hpl-2.0/include > -I/pool/measure/hpl-2.0/include/Linux_PII_FBLAS > -I/pool/MPI/openmpi/include -fomit-frame-pointer -O3 -funroll-loops -W > -Wall -o /pool/measure/hpl-2.0/bin/Linux_PII_FBLAS/xhpl HPL_pddriver.o > HPL_pdinfo.o HPL_pdtest.o > /pool/measure/hpl-2.0/lib/Linux_PII_FBLAS/libhpl.a > /pool/libs/BLAS/blas_LINUX.a /pool/MPI/openmpi/lib/libmpi.so > /usr/bin/ld: cannot find -ldl > /usr/bin/ld: cannot find -lnsl > /usr/bin/ld: cannot find -lutil > /usr/bin/ld: cannot find -lm > /usr/bin/ld: cannot find -ldl > /usr/bin/ld: cannot find -lm > collect2: ld returned 1 exit status > make[2]: *** [dexe.grd] Error 1 > make[2]: Leaving directory > `/pool/measure/hpl-2.0/testing/ptest/Linux_PII_FBLAS' > make[1]: *** [build_tst] Error 2 > make[1]: Leaving directory `/pool/measure/hpl-2.0' > make: *** [build] Error 2 > > And the attachment is the make file I created. OS is Fedora 14 x86_64. > > Could anyone show me where is going wrong? Thanks. > > Eric > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
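The missing libraries can be checked for directly; a sketch (on Fedora x86_64 the linker-visible .so symlinks typically come from -devel packages such as glibc-devel, which is an assumption to verify with your package manager):

```shell
# Look for the .so names ld complained about. On a Fedora x86_64 box they
# should live under /usr/lib64; other distros use /usr/lib or
# /usr/lib/x86_64-linux-gnu.
for lib in libdl libnsl libutil libm; do
    found=$(ls /usr/lib64/${lib}.so* /usr/lib/${lib}.so* 2>/dev/null | head -n 1)
    if [ -n "$found" ]; then
        echo "$lib: $found"
    else
        echo "$lib: no .so found; the -devel package providing it may be missing"
    fi
done
```

If any of the four are reported missing, install the corresponding development package and re-run the HPL build before suspecting the MPI installation.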
Re: [OMPI users] OpenMPI-PGI: /usr/bin/ld: Warning: size of symbol `#' changed from # in #.o to # in #.so
I checked that this issue is not caused by using different compile options for different libraries. There is a set of libraries and an executable, all compiled with mpif90, and this warning appears for the executable's object and one of the libraries... 2011/3/25 Dmitry N. Mikushin : > Hi, > > I'm wondering if anybody have seen something similar, and have you > succeeded to run your application compiled by openmpi-pgi-1.4.2 with > the following warnings: > > /usr/bin/ld: Warning: size of symbol `mpi_fortran_errcodes_ignore_' > changed from 4 in foo.o to 8 in lib/libfoolib2.so > /usr/bin/ld: Warning: size of symbol `mpi_fortran_argv_null_' changed > from 1 in foo.o to 8 in lib/libfoolib2.so > /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_message_mod_0_' in > lib/libfoolib1.so is smaller than 32 in foo.o > /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_abort_mod_0_' in > lib/libfoolib1.so is smaller than 32 in foo.o > /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_ioinit_mod_0_' in > lib/libfoolib1.so is smaller than 32 in foo.o > /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_gatherv_mod_6_' in > lib/libfoolib1.so is smaller than 32 in foo.o > > Symbols names look like being internal to OpenMPI, there was one > similar issue in archive back in 2006: > https://svn.open-mpi.org/trac/ompi/changeset/11057 could it be hit > again? > > Thanks, > - D. >
[OMPI users] OpenMPI-PGI: /usr/bin/ld: Warning: size of symbol `#' changed from # in #.o to # in #.so
Hi, I'm wondering if anybody has seen something similar: have you succeeded in running an application compiled with openmpi-pgi-1.4.2 that produces the following warnings: /usr/bin/ld: Warning: size of symbol `mpi_fortran_errcodes_ignore_' changed from 4 in foo.o to 8 in lib/libfoolib2.so /usr/bin/ld: Warning: size of symbol `mpi_fortran_argv_null_' changed from 1 in foo.o to 8 in lib/libfoolib2.so /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_message_mod_0_' in lib/libfoolib1.so is smaller than 32 in foo.o /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_abort_mod_0_' in lib/libfoolib1.so is smaller than 32 in foo.o /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_ioinit_mod_0_' in lib/libfoolib1.so is smaller than 32 in foo.o /usr/bin/ld: Warning: alignment 16 of symbol `_mpl_gatherv_mod_6_' in lib/libfoolib1.so is smaller than 32 in foo.o The symbol names look like they are internal to Open MPI; there was one similar issue in the archive back in 2006: https://svn.open-mpi.org/trac/ompi/changeset/11057 Could it have been hit again? Thanks, - D.
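One way to see the mismatch the linker reports is to dump the symbol's recorded size on both sides; a sketch, where foo.o and lib/libfoolib2.so are the placeholder names from the warnings above:

```shell
# Print the size nm records for the symbol in the object file and in the
# shared library; a 4-vs-8 difference here matches the linker warning and
# usually indicates the two were built against different Open MPI builds.
SYM=mpi_fortran_errcodes_ignore_
for f in foo.o lib/libfoolib2.so; do
    if [ -e "$f" ]; then
        nm --print-size --dynamic "$f" 2>/dev/null | grep "$SYM" \
            || nm --print-size "$f" | grep "$SYM"
    else
        echo "$f: not present (placeholder name from the warning)"
    fi
done
```

Run it from the build directory with your real object and library names substituted for the placeholders.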