Dave,
Doesn't KMeansCUDA require a points.dat file? If you are running in
x10.dist/samples/CUDA, you can try
$ runx10 KMeansCUDA -i 50 -p ../points.dat
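If you are not sure where points.dat lives, a small wrapper can check for the file before launching. This is only a sketch: run_kmeans is a made-up name (nothing in the X10 distribution provides it), and the -i and -p values are simply the ones from the command above.

```shell
# run_kmeans is a hypothetical wrapper, not part of the X10 distribution.
# Usage: run_kmeans [path-to-points.dat]
run_kmeans() {
    points="${1:-../points.dat}"   # default matches the command above
    if [ ! -r "$points" ]; then
        # Fail early with a clear message instead of a FileNotFoundException
        echo "cannot read points file: $points" >&2
        return 1
    fi
    runx10 KMeansCUDA -i 50 -p "$points"
}
```

Calling run_kmeans with no argument uses ../points.dat; pass a different path as the first argument if your copy is elsewhere.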
HTH,
Igor
David E Hudak <[email protected]> wrote on 11/23/2010 04:36:54 PM:
> Thanks, Dave.
>
> OK, so I did an svn up and retest. It all worked except KMeansCUDA.
>
> Here’s what worked:
>
> dhu...@opt2648 535%> x10c++ -O -NO_CHECKS -STATIC_CALLS
> CUDATopology.x10 -o CUDATopology
> dhu...@opt2648 524%> runx10 CUDATopology
> X10_NPLACES not set. Assuming 1 place, running locally
> Dumping places at place: (Place 0)
> Place: (Place 0)
> Parent: (Place 0)
> NumChildren: 2
> Is a Host place
> Child 0: (Place 1)
> Parent: (Place 0)
> NumChildren: 0
> Is a CUDA place
> Child 1: (Place 2)
> Parent: (Place 0)
> NumChildren: 0
> Is a CUDA place
>
>
> dhu...@opt2648 536%> x10c++ -O -NO_CHECKS -STATIC_CALLS
> CUDABlackScholes.x10 -o CUDABlackScholes
> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124):
> warning: class "_X10ClosureMap" defines no constructor to initialize the following:
> const member "_X10ClosureMap::_x10members"
>
>
> ptxas info : Compiling entry function
> 'CUDABlackScholes__closure__0' for 'sm_10'
> ptxas info : Used 17 registers, 96+16 bytes smem, 65536 bytes
> cmem[0], 40 bytes cmem[1]
>
> dhu...@opt2648 525%> runx10 CUDABlackScholes
> X10_NPLACES not set. Assuming 1 place, running locally
> Using the GPU at place (Place 1)
> This program only supports a single GPU.
> Running 512 times on place (Place 1)
> Options count : 8000000
> BlackScholesGPU() time : 1.058917974016E12 msec
> Effective memory bandwidth: 75.548812866210938 GB/s
> Gigaoptions per second : 7.55488109588623
> Generating a second set of results at place (Place 0)
> Verifying the reuslts match...
> L1 norm: 1.0E-7
> Max absolute error: 6.0E-7
>
> TEST PASSED
>
>
> dhu...@opt2648 528%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDA3DFD.x10
> -o CUDA3DFD
> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124):
> warning: class "_X10ClosureMap" defines no constructor to initialize the following:
> const member "_X10ClosureMap::_x10members"
>
> ptxas info : Compiling entry function 'CUDA3DFD__closure__1'
> for 'sm_10'
> ptxas info : Used 18 registers, 80+16 bytes smem, 65536 bytes
> cmem[0], 24 bytes cmem[1]
>
> dhu...@opt2648 529%> runx10 ./CUDA3DFD
> X10_NPLACES not set. Assuming 1 place, running locally
> 480x480x400
> allocated 703.125000 MB on device
> -------------------------------
> time: 22 ms
> throughput: 4105.30908203125 MPoints/s
> -------------------------------
>
> comparing to CPU result...
>
> Result within epsilon
>
>
> TEST PASSED!
>
>
> dhu...@opt2648 532%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDAMatMul.x10
> -o CUDAMatMul
> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124):
> warning: class "_X10ClosureMap" defines no constructor to initialize the following:
> const member "_X10ClosureMap::_x10members"
>
> ptxas info : Compiling entry function 'CUDAMatMul__closure__0'
> for 'sm_10'
> ptxas info : Used 56 registers, 64+0 bytes lmem, 96+16 bytes
> smem, 65536 bytes cmem[0], 24 bytes cmem[1]
>
> dhu...@opt2648 533%> runx10 CUDAMatMul
> X10_NPLACES not set. Assuming 1 place, running locally
>
>
> testing sgemm( 'N', 'N', n, n, n, ... )
>
>
> 4096 31.258859505094609 GF/s in 4.396800000000001 seconds
>
>
>
> Here is what did not:
>
> dhu...@opt2648 537%> x10c++ -O -NO_CHECKS -STATIC_CALLS KMeansCUDA.x10
> -o KMeansCUDA
> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124):
> warning: class "_X10ClosureMap" defines no constructor to initialize the following:
> const member "_X10ClosureMap::_x10members"
> ptxas info : Compiling entry function 'KMeansCUDA__closure__5'
> for 'sm_10'
> ptxas info : Used 13 registers, 80+16 bytes smem, 65536 bytes
> cmem[0], 80 bytes cmem[1]
>
> dhu...@opt2648 538%> runx10 KMeansCUDA -i 50
> X10_NPLACES not set. Assuming 1 place, running locally
> points: 100000 clusters: 8 dim: 4
> x10.io.FileNotFoundException: points.dat
> at x10::lang::Throwable::fillInStackTrace()
> at x10aux::io::FILEPtrStream::open_file(x10aux::ref<x10::lang::String> const&, char const*)
> at x10::io::FileReader__FileInputStream::_make(x10aux::ref<x10::lang::String>)
> at x10::io::FileReader::_constructor(x10aux::ref<x10::io::File>)
> at x10::io::FileReader::_make(x10aux::ref<x10::io::File>)
> at x10::io::File::openRead()
> at KMeansCUDA::main(x10aux::ref<x10::array::Array<x10aux::ref<x10::lang::String> > >)
> at x10aux::BootStrapClosure::apply()
> at x10_lang_Runtime__closure__2::apply()
> at x10::lang::Activity::run()
> at x10::lang::Runtime__Worker::loop(x10aux::ref<x10::lang::SimpleLatch>, bool)
> at x10::lang::Runtime__Worker::apply()
> at x10::lang::Runtime__Pool::apply()
> at x10::lang::Runtime::start(x10aux::ref<x10::lang::VoidFun_0_0>, x10aux::ref<x10::lang::VoidFun_0_0>)
> at int x10aux::template_main<x10::lang::Runtime, KMeansCUDA>(int, char**)
> at __libc_start_main
> at __gxx_personality_v0
>
> Dave
> On Nov 23, 2010, at 3:00 PM, Dave Cunningham wrote:
>
> > Hi
> >
> > The error message doesn't necessarily imply that nvcc couldn't be found.
> > In fact, the errors you got were from nvcc itself, and we print the same
> > message no matter how the invocation of nvcc fails.
> >
> > The problem was a regression caused by a recent change in SVN. It's now
> > fixed in r18467.
> >
> > thanks for your interest in CUDA
> >
> >
> >
> > On Tue, Nov 23, 2010 at 1:51 PM, David E Hudak <[email protected]> wrote:
> >
> >> Hi All,
> >>
> >> I am building X10 from the trunk (checked out last night) and am running
> >> into a problem with executing nvcc:
> >> ----------------------------------------
> >> dhu...@opt2648 509%> x10c++ -O -NO_CHECKS -STATIC_CALLS CUDAMatMul.x10
> >> -o CUDAMatMul
> >> x10c++: /nfs/07/dhudak/devel/x10/20101122r2/x10-trunk/x10.dist/include/x10aux/debug.h(124):
> >> warning: class "_X10ClosureMap" defines no constructor to initialize the following:
> >> const member "_X10ClosureMap::_x10members"
> >>
> >> CUDAMatMul.cu(41): error: namespace "x10aux" has no member "zeroCheck"
> >>
> >> CUDAMatMul.cu(44): error: namespace "x10aux" has no member "zeroCheck"
> >>
> >> CUDAMatMul.cu(47): error: namespace "x10aux" has no member "zeroCheck"
> >>
> >> CUDAMatMul.cu(50): error: namespace "x10aux" has no member "zeroCheck"
> >>
> >> 4 errors detected in the compilation of
> >> "/tmp/tmpxft_0000626f_00000000-4_CUDAMatMul.cpp1.ii".
> >> x10c++: Non-zero return code: 2
> >> x10c++: Found @CUDA annotation, but not compiling for GPU because nvcc
> >> could not be run (check your $PATH).
> >> ----------------------------------------
> >>
> >> However, nvcc is in my path:
> >>
> >> ----------------------------------------
> >> dhu...@opt2648 510%> which nvcc
> >> /usr/local/cuda-3.1/cuda/bin/nvcc
> >> ----------------------------------------
> >>
> >> Any suggestions?
> >>
> >> Thanks,
> >> Dave
--
Igor Peshansky (note the spelling change!)
IBM T.J. Watson Research Center
X10: Parallel Productivity and Performance (http://x10-lang.org/)
XJ: No More Pain for XML's Gain (http://www.research.ibm.com/xj/)
"I hear and I forget. I see and I remember. I do and I understand" --
Xun Zi