OK, I followed these instructions:
http://x10-lang.org/documentation/practical-x10-programming/x10-on-gpus
…and got CUDATopology to work:
1004 x10c++ -O -NO_CHECKS -x10rt mpi CUDATopology.x10 -o CUDATopology
1005 X10RT_ACCELS=ALL mpiexec -pernode ./CUDATopology
dhudak@n0282 1012%> !1005
X10RT_ACCELS=ALL mpiexec -pernode ./CUDATopology
Dumping places at place: Place(0)
Place: Place(0)
Parent: Place(0)
NumChildren: 2
Is a Host place
Child 0: Place(2)
Parent: Place(0)
NumChildren: 0
Is a CUDA place
Child 1: Place(3)
Parent: Place(0)
NumChildren: 0
Is a CUDA place
Place: Place(1)
Parent: Place(1)
NumChildren: 2
Is a Host place
Child 0: Place(4)
Parent: Place(1)
NumChildren: 0
Is a CUDA place
Child 1: Place(5)
Parent: Place(1)
NumChildren: 0
Is a CUDA place
Dumping places at place: Place(1)
Place: Place(0)
Parent: Place(0)
NumChildren: 2
Is a Host place
Child 0: Place(2)
Parent: Place(0)
NumChildren: 0
Is a CUDA place
Child 1: Place(3)
Parent: Place(0)
NumChildren: 0
Is a CUDA place
Place: Place(1)
Parent: Place(1)
NumChildren: 2
Is a Host place
Child 0: Place(4)
Parent: Place(1)
NumChildren: 0
Is a CUDA place
Child 1: Place(5)
Parent: Place(1)
NumChildren: 0
Is a CUDA place
…but other examples are not building. I assume it's the combination of the new
version of X10 and the new version of CUDA, but I figured I would pass it along
to the mailing list.
dhudak@oak-rw 999%> module list
Currently Loaded Modules:
  1) torque/2.5.10   2) moab/6.1.4    3) modules/1.0   4) gnu/4.4.5
  5) mvapich2/1.7    6) mkl/10.3.0    7) cuda/4.1.28   8) java/1.7.0_02
  9) x10/2.2.2-cuda
dhudak@oak-rw 1000%> which nvcc
/usr/local/cuda/4.1.28/bin/nvcc
dhudak@oak-rw 1001%> x10c++ -O -NO_CHECKS -x10rt mpi CUDA3DFD.x10 -o CUDA3DFD
x10c++: nvcc fatal : Value 'sm_30' is not defined for option 'gpu-architecture'
x10c++: Non-zero return code: 255
x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc could
not be run (check your $PATH).
dhudak@oak-rw 1002%>
dhudak@oak-rw 1002%> x10c++ -O -NO_CHECKS -x10rt mpi CUDAKernelTest.x10 -o CUDAKernelTest
x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
x10c++: ./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
./CUDAKernelTest.cu(56): Warning: Cannot tell what pointer points to,
assuming global memory space
x10c++: nvcc fatal : Value 'sm_30' is not defined for option 'gpu-architecture'
x10c++: Non-zero return code: 255
x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc could
not be run (check your $PATH).
dhudak@oak-rw 1003%> which nvcc
/usr/local/cuda/4.1.28/bin/nvcc
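
(Editor's note, hedged: the error above suggests x10c++ is passing
-gpu-architecture=sm_30 to nvcc, but CUDA 4.1's nvcc predates Kepler and does
not know sm_30; the M2070 cards are Fermi-class, i.e. compute capability 2.0
(sm_20), in any case. If x10c++ exposes no option to change the target
architecture, one hypothetical workaround, assuming x10c++ resolves nvcc via
$PATH as its own message implies, is a shim that rewrites sm_30 to sm_20 before
invoking the real nvcc. This is a sketch, not a verified fix:)

```shell
# Hypothetical shim (unverified): rewrite the unsupported sm_30 target to
# sm_20 (what the Fermi M2070s support), then call the real nvcc.
mkdir -p "$HOME/nvcc-shim"
cat > "$HOME/nvcc-shim/nvcc" <<'EOF'
#!/bin/bash
# ${@/sm_30/sm_20} applies the substitution to each argument in turn.
exec /usr/local/cuda/4.1.28/bin/nvcc "${@/sm_30/sm_20}"
EOF
chmod +x "$HOME/nvcc-shim/nvcc"
# Put the shim ahead of the real nvcc so x10c++ picks it up.
export PATH="$HOME/nvcc-shim:$PATH"
```

(Whether x10c++ consults $PATH or a configured absolute path for nvcc would
need to be confirmed against the X10 2.2.2 sources.)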
Regards,
Dave
On Feb 11, 2012, at 5:09 PM, David E Hudak wrote:
> Hi All,
>
> I have a code sample that I want to try on our new cluster. These are
> dual-socket nodes with dual-M2070 cards connected by QDR IB.
>
> I configured my local environment and built the code as follows:
> svn co https://x10.svn.sourceforge.net/svnroot/x10/tags/SF_RELEASE_2_2_2 x10-2.2.2
> cd x10-2.2.2/x10.dist
> ant -DNO_CHECKS=true -Doptimize=true -DX10RT_MPI=true -DX10RT_CUDA=true diet
>
> Things build.
>
> And then I get an interactive PBS job on 2 nodes. I would like to launch
> the program with 2 X10 places per node, with each X10 place having one child
> place for a GPU. Does anyone have the incantation that would launch this
> configuration?
>
> By the way, is there a hostname function in X10 I can call to verify which
> node I am running on?
>
> So, first I tried...
>
> dhudak@n0282 1021%> mpiexec -pernode ./CUDATopology
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 0
> Is a Host place
>
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 0
> Is a Host place
>
> …and it ran one copy of the program on each of the two nodes. (I verified by
> running top on the other node and seeing a CUDATopology process running.)
>
> If I add the X10RT_ACCELS variable, each copy finds the two cards:
>
> dhudak@n0282 1012%> X10RT_ACCELS=ALL mpiexec -pernode ./CUDATopology
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(1)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
>
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(1)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
>
> OK, so I wanted place 1 on one node and place 2 on another node:
>
> dhudak@n0282 1029%> X10RT_ACCELS=ALL X10_NPLACES=2 mpiexec -pernode ./CUDATopology
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(3)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Place: Place(1)
> Parent: Place(1)
> NumChildren: 2
> Is a Host place
> Child 0: Place(4)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(5)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
>
> Dumping places at place: Place(1)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(3)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Place: Place(1)
> Parent: Place(1)
> NumChildren: 2
> Is a Host place
> Child 0: Place(4)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(5)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
>
> Dumping places at place: Place(0)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(3)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Place: Place(1)
> Parent: Place(1)
> NumChildren: 2
> Is a Host place
> Child 0: Place(4)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(5)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
>
> Dumping places at place: Place(1)
> Place: Place(0)
> Parent: Place(0)
> NumChildren: 2
> Is a Host place
> Child 0: Place(2)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(3)
> Parent: Place(0)
> NumChildren: 0
> Is a CUDA place
> Place: Place(1)
> Parent: Place(1)
> NumChildren: 2
> Is a Host place
> Child 0: Place(4)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
> Child 1: Place(5)
> Parent: Place(1)
> NumChildren: 0
> Is a CUDA place
>
> Does anyone have any advice?
>
> Thanks,
> Dave
> ---
> David E. Hudak, Ph.D. [email protected]
> Program Director, HPC Engineering
> Ohio Supercomputer Center
> http://www.osc.edu
>
> _______________________________________________
> X10-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/x10-users