Re: [OMPI devel] CUDA kernels in OpenMPI

2017-01-30 Thread Chris Ward
I added the following lines to my 'Makefile.am' in the directory with the CUDA sources:

.cu.lo:
	/usr/local/cuda/bin/nvcc -gencode arch=compute_60,code=sm_60 -O3 --cuda -c $<
	mv -f $*.cu.cpp.ii $*.ii
	libtool --mode=compile $(CXX) $(CXXFLAGS) -c $*.ii

and added the CUDA

Re: [OMPI devel] CUDA kernels in OpenMPI

2017-01-27 Thread Sylvain Jeaugey
Hi Chris, First, you will need to have some configure stuff to detect nvcc and use it inside your Makefile. UTK may have some examples to show here. For the C/C++ API, you need to add 'extern "C"' statements around the interfaces you want to export in C so that you can use them inside Open
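
A minimal sketch of the 'extern "C"' point above, assuming a single reduction entry point. The name example_sum_reduce and its signature are hypothetical (they do not come from Open MPI or from this thread), and a plain host loop stands in for the kernel launch that a real '.cu' file would contain.

```cpp
// Hypothetical names throughout; nothing here is taken from Open MPI.
#include <cassert>
#include <cstddef>

#ifdef __cplusplus
extern "C" {
#endif

// Declaration with C linkage: the symbol is not name-mangled, so code
// compiled as C can declare and call it directly.
void example_sum_reduce(const double *in, double *inout, size_t count);

#ifdef __cplusplus
}
#endif

// Definition compiled as C++ (or by nvcc). In a real '.cu' file this body
// would launch a kernel; a host loop stands in for it in this sketch.
extern "C" void example_sum_reduce(const double *in, double *inout, size_t count)
{
    assert(in != nullptr && inout != nullptr);
    for (size_t i = 0; i < count; ++i) {
        inout[i] += in[i];  // element-wise sum, as in a sum Allreduce
    }
}
```

The C side would then only need the prototype, without the 'extern "C"' wrapper, and could call example_sum_reduce like any other C function.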

[OMPI devel] CUDA kernels in OpenMPI

2017-01-27 Thread Chris Ward
It looks like the mailing system deleted the attachment, so here it is inline:

#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
#                         University Research and Technology
#                         Corporation. All rights reserved.
# Copyright (c)

[OMPI devel] CUDA kernels in OpenMPI

2017-01-27 Thread Chris Ward
Here is the complete Makefile so far. I have it in directory ompi/mca/coll/ibm, which contains an implementation of an IBM-written collectives library. It won't work as-is, because I don't know how to use 'libtool', which is presumably needed to do the compile. If anybody can show me a rule

Re: [OMPI devel] CUDA kernels in OpenMPI

2017-01-27 Thread Dmitry N. Mikushin
It's hard to tell without a complete makefile example. Could you please post a minimal repro case? Note that, specifically for OpenMPI, there is a tricky workaround: you can use nvcc as the mpicc compiler by exporting OMPI_CC=nvcc and wrapping it to filter out incompatible compiler options. Kind regards, - Dmitry

[OMPI devel] CUDA kernels in OpenMPI

2017-01-27 Thread Chris Ward
I'm trying to build a CUDA kernel into OpenMPI (because I'm experimenting with an Allreduce collective with data in GPU buffers, and I want the GPU to do the reduction). This involves writing a '.cu' file, and compiling this to '.o' with the NVIDIA CUDA compiler 'nvcc'; and also writing some