[OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread KESTENER Pierre
Hello, I'm having problems running a simple CUDA-aware MPI application, the one found at https://github.com/parallel-forall/code-samples/tree/master/posts/cuda-aware-mpi-example. I have changed the symbol ENV_LOCAL_RANK to OMPI_COMM_WORLD_LOCAL_RANK. My cluster has 2 K20m GPUs per node, with QLogic

Re: [OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread Rolf vandeVaart
Let me try this out and see what happens for me. But yes, please go ahead and send me the complete backtrace. Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of KESTENER Pierre Sent: Wednesday, October 30, 2013 11:34 AM To: us...@open-mpi.org Cc: KESTENER Pierre Subject: [OMPI use

Re: [OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread KESTENER Pierre
Dear Rolf, thanks for looking into this. Here is the complete backtrace for execution using 2 GPUs on the same node: (cuda-gdb) bt #0 0x7711d885 in raise () from /lib64/libc.so.6 #1 0x7711f065 in abort () from /lib64/libc.so.6 #2 0x70387b8d in psmi_errhandler_psm (ep=,

Re: [OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread Rolf vandeVaart
The CUDA-aware support is only available when running with the verbs interface to InfiniBand. It does not work with the PSM interface, which is being used in your installation. To verify this, you need to disable the usage of PSM. This can be done in a variety of ways, but try running like this

[OMPI users] ofed installation

2013-10-30 Thread Robo Beans
Hello everyone, I am trying to install ofed-1.5.3.2 on CentOS 6.4 using the provided install.pl, but I am getting the following error: /lib/modules/2.6.32-358.el6.x86_64/build/scripts is required to build kernel-ib RPM. // info. about current kernel $ uname -a Linux scc-10-2-xx-xx-xyz.com 2.6.32-358.el6.x

Re: [OMPI users] ofed installation

2013-10-30 Thread Ralph Castain
Looks to me like that's an error from the OFED installer, not something from OMPI. Have you tried their mailing list? On Oct 30, 2013, at 1:05 PM, Robo Beans wrote: > > Hello everyone, > > I am trying to install ofed-1.5.3.2 on centos 6.4 using install.pl provided > but getting following er

Re: [OMPI users] ofed installation

2013-10-30 Thread Robo Beans
I did try ofed forum: https://www.openfabrics.org/forum/7-installation/882-ofed-1532.html#882 but was wondering if group members faced similar issue as well while installing ofed and what steps they followed to resolve it? Thanks! Robo On Wed, Oct 30, 2013 at 1:22 PM, Ralph Castain wrote: >

[OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jim Parker
Hello, I have recently built a cluster that uses the 64-bit indexing feature of OpenMPI following the directions at http://wiki.chem.vu.nl/dirac/index.php/How_to_build_MPI_libraries_for_64-bit_integers My question is: what are the new prototypes for the MPI calls? Specifically MPI_RECV MPI_Allga

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Ralph Castain
I believe this has been a long-standing issue with the MPI definitions - they specify "int", which on most systems will default to int32_t. Thus, there are no prototypes for 64-bit interfaces On Oct 30, 2013, at 1:35 PM, Jim Parker wrote: > Hello, > I have recently built a cluster that uses

Re: [OMPI users] ofed installation

2013-10-30 Thread Ralph Castain
Afraid I don't, but maybe someone else here does... On Oct 30, 2013, at 1:30 PM, Robo Beans wrote: > I did try ofed forum: > https://www.openfabrics.org/forum/7-installation/882-ofed-1532.html#882 > > but was wondering if group members faced similar issue as well while > installing ofed and w

Re: [OMPI users] ofed installation

2013-10-30 Thread Jeff Squyres (jsquyres)
I think you'll have better luck with the OFED support channels -- this list is mainly about supporting Open MPI. On Oct 30, 2013, at 4:30 PM, Robo Beans wrote: > I did try ofed forum: > https://www.openfabrics.org/forum/7-installation/882-ofed-1532.html#882 > > but was wondering if group mem

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jim Parker
Ralph, If I understand your comment, there is no standard way to define 64-bit MPI calls. So how does OpenMPI recommend I pass information? Just declaring some 64-bit integers is not working. Is there a working example somewhere? Cheers, --Jim On Wed, Oct 30, 2013 at 3:40 PM, Ralph Castain

Re: [OMPI users] ofed installation

2013-10-30 Thread Elken, Tom
Just to give a quick pointer... RHEL 6.4 is pretty new, and OFED 1.5.3.2 is pretty old, so that is likely the root of your issue. I believe the first OFED that supported RHEL 6.4, which is roughly equivalent to CentOS 6.4, is OFED 3.5-1: http://www.openfabrics.org/downloads/OFED/ofed-3.5-1/ What also mig

Re: [OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread KESTENER Pierre
Thanks for your help, it is working now; I hadn't noticed that limitation. Best regards, Pierre Kestener. From: users [users-boun...@open-mpi.org] on behalf of Rolf vandeVaart [rvandeva...@nvidia.com] Sent: Wednesday, October 30, 2013 17:26 To: Open MPI

Re: [OMPI users] ofed installation

2013-10-30 Thread Robo Beans
Thanks guys for your time. I have the latest version of kernel and kernel-devel (kernel-2.6.32-358.23.2.el6.x86_64 and kernel-devel-2.6.32-358.23.2.el6.x86_64), but I believe the OFED installer was looking for the base version of kernel and kernel-devel (2.6.32-358.el6.x86_64) root@scc-10-2-xx-xx:/opt/OFED-1.

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jeff Squyres (jsquyres)
On Oct 30, 2013, at 4:35 PM, Jim Parker wrote: > I have recently built a cluster that uses the 64-bit indexing feature of > OpenMPI following the directions at > http://wiki.chem.vu.nl/dirac/index.php/How_to_build_MPI_libraries_for_64-bit_integers That should be correct (i.e., passing -i8 in

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jim Parker
Jeff and Ralph, Ok, I downshifted to a helloWorld example (attached). Bottom line: after I hit the MPI_Recv call, my local variable (rank) gets borked. I have compiled with -m64 -fdefault-integer-8 and have even assigned kind=8 to the integers (which would be the preferred method in my case) You

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jeff Squyres (jsquyres)
Can you send the information listed here: http://www.open-mpi.org/community/help/ On Oct 30, 2013, at 6:22 PM, Jim Parker wrote: > Jeff and Ralph, > Ok, I downshifted to a helloWorld example (attached), bottom line after I > hit the MPI_Recv call, my local variable (rank) gets borked. >

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jim Parker
Jeff, Here's what I know: 1. Checked FAQs. Done 2. Version 1.6.5 3. config.log file has been removed by the sysadmin... 4. ompi_info -a from head node is attached as headnode.out 5. N/A 6. compute node info is attached as compute-x-yy.out 7. As discussed, local variables are being overwritt

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Martin Siegert
Hi Jim, I have quite a bit of experience with compiling openmpi for dirac. Here is what I use to configure openmpi: ./configure --prefix=$instdir \ --disable-silent-rules \ --enable-mpirun-prefix-by-default \ --with-threads=posix \ --enable-cxx-excepti

Re: [OMPI users] Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jeff Squyres (jsquyres)
I've compiled your application and seen similar behavior (I snipped one of the writes and abbreviated another): - Iam = 3 received 8 Iam = 0 received 3 Iam = 1 received 8

[OMPI users] Fwd: Prototypes for Fortran MPI_ commands using 64-bit indexing

2013-10-30 Thread Jim Parker
Ok, all, where to begin... Perhaps I should start with the most pressing issue for me. I need 64-bit indexing. @Martin, you indicated that even if I get this up and running, the MPI library still uses signed 32-bit ints to count (your term), or index (my term) the recvbuffer lengths. More con
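
Editor's note on the 32-bit count limit raised in this thread: if a count argument must fit in a signed 32-bit integer, one common workaround is to split a large buffer into several transfers, each with a count that fits. The sketch below is only an illustration of that arithmetic, not something proposed by the posters. It makes no MPI calls, and the names split_count and INT32_MAX are hypothetical helpers introduced here.

```python
# Sketch only: plan how to cover a 64-bit element count with chunks whose
# individual counts fit in a signed 32-bit integer. No MPI calls are made;
# split_count and INT32_MAX are hypothetical names used for illustration.

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit count can hold


def split_count(total, max_count=INT32_MAX):
    """Yield (offset, count) pairs that cover `total` elements in order.

    Every yielded count is at most `max_count`, so each pair could be
    passed to a routine whose count argument is a signed 32-bit int.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    offset = 0
    while offset < total:
        count = min(max_count, total - offset)
        yield offset, count
        offset += count


if __name__ == "__main__":
    # Five billion elements do not fit in one 32-bit count.
    for offset, count in split_count(5_000_000_000):
        print(f"offset={offset} count={count}")
```

Each (offset, count) pair would correspond to one send or receive on a sub-range of the buffer. Whether this is appropriate for a given application (for example, one that relies on a single matching receive) is an assumption to check case by case.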