Hi Charles,

> I was benchmarking 4096x4096 matrices (again, with my R bindings).  By
> 'slower' I mean that I am observing OpenCL at this size beating the
> OpenBLAS CPU implementation by over 2X but the CUDA implementation is
> nearly 5X slower than the CPU.  This seemed odd to me that the CUDA
> would be so much slower than the OpenCL, hence my initial thought to
> invite others to review my code if I am making some sort of silly
> mistake.  Otherwise I was intending to begin trying to pursue direct
> cublas methods but I would very much prefer to use ViennaCL.

Okay, in this case what Philippe wrote was already the full answer. Our
OpenCL kernels are highly GPU-specific, and a 'good' kernel is generated at
runtime. We haven't 'ported' these kernels (i.e. translated them one-to-one
from OpenCL to CUDA) to the CUDA backend yet, so the CUDA backend only uses
a fallback kernel. It should be possible to carry these over without too
much effort, but in that case it makes more sense to just call the cuBLAS
routines instead. Adding this for ViennaCL 1.7.1 is certainly possible if
that is what you would be happy with.
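
If you try the direct cuBLAS route in the meantime, one detail to keep in
mind: cuBLAS assumes column-major storage, whereas ViennaCL matrices are
row-major by default. A common way around this is to swap the operands, since
a row-major buffer read as column-major is the transpose. The following is
only a CPU-side sketch of that index trick. gemm_colmajor is a hypothetical
stand-in with a cuBLAS-like argument order (no transposition), not a cuBLAS
or ViennaCL function, and I am assuming unpadded buffers.

```cpp
#include <cstddef>

// CPU stand-in for a column-major GEMM without transposition:
// C (m x n) = A (m x k) * B (k x n), all stored column-major.
// Hypothetical helper for illustration; not a cuBLAS or ViennaCL function.
void gemm_colmajor(std::size_t m, std::size_t n, std::size_t k,
                   const float* A, std::size_t lda,
                   const float* B, std::size_t ldb,
                   float* C, std::size_t ldc)
{
  for (std::size_t j = 0; j < n; ++j) {
    for (std::size_t i = 0; i < m; ++i) {
      float sum = 0.0f;
      for (std::size_t p = 0; p < k; ++p)
        sum += A[i + p * lda] * B[p + j * ldb];
      C[i + j * ldc] = sum;
    }
  }
}

// Row-major C (m x n) = A (m x k) * B (k x n), computed with the
// column-major routine only. Read as column-major, the buffers hold
// A^T, B^T and C^T, so we compute C^T = B^T * A^T by swapping the
// operands and exchanging m and n. Assumes unpadded row-major buffers.
void gemm_rowmajor(std::size_t m, std::size_t n, std::size_t k,
                   const float* A, const float* B, float* C)
{
  gemm_colmajor(n, m, k, B, n, A, k, C, n);
}
```

With a real column-major library the call would follow the same pattern:
pass B first, then A, and exchange the row and column counts. No explicit
transposition or copy is needed.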

Best regards,
Karli



> On Sat, Aug 1, 2015 at 3:56 AM, Karl Rupp <r...@iue.tuwien.ac.at
> <mailto:r...@iue.tuwien.ac.at>> wrote:
>
>     Hi Charles,
>
>     can you please quantify what you mean by 'slower'? How does 'slower'
>     change as you increase the problem size? I would not be surprised if
>     you see no performance gains below matrices of size 500-by-500. With
>     the extra back-and-forth through PCI-Express you may even need
>     matrices of at least 1000-by-1000.
>
>     Best regards,
>     Karli
>
>
>
>     On 07/31/2015 09:04 PM, Charles Determan wrote:
>
>         Greetings,
>
>         Brief background, I am developing a series of R packages to bring
>         ViennaCL to the R community.  I have had success with the
>         development of
>         my gpuR package (https://github.com/cdeterman/gpuR) which relies
>         on the
>         OpenCL backend of ViennaCL (which is housed in the package
>         RViennaCL).
>         I am hoping to submit to CRAN in the coming weeks now that the
>         latest
>         stable ViennaCL version has just been released.
>
>         Naturally, I wanted a companion package for a CUDA backend.
>         This is now
>         the gpuRcuda package (https://github.com/cdeterman/gpuRcuda).
>         This has
>         appeared to work successfully as most of the code is the same.
>         However,
>         my initial benchmarks are showing very dismal performance with
>         the CUDA
>         backend.
>
>         I was wondering if someone from this list would be willing to have a
>         look at my code to see why the CUDA code would be so much
>         worse.  I had
>         thought, given a working NVIDIA card (GeForce GTX 970), CUDA would
>         provide improved speed but the benchmarks are showing performance at
>         least 5-fold slower than the CPU based R multiplication.  Even the
>         'float' type matrix multiplication is slower than R (which only has
>         double type support!).
>
>         The sgemm CUDA file is
>         (https://github.com/cdeterman/gpuRcuda/blob/master/src/vcl_sgemm.cu)
>         and
>         the associated C++ file is
>         (https://github.com/cdeterman/gpuRcuda/blob/master/src/vcl_cudaMatrix_gemm.cpp).
>
>         Other note, I have tried making the two packages completely
>         independent
>         and the performance is still very poor with CUDA.
>
>         I really appreciate any help others could provide
>         troubleshooting this.
>         I have truly run out of ideas as to why the code has such poor
>         performance.
>
>         Regards,
>         Charles
>
>
>         ------------------------------------------------------------------------------
>
>
>
>         _______________________________________________
>         ViennaCL-devel mailing list
>         ViennaCL-devel@lists.sourceforge.net
>         <mailto:ViennaCL-devel@lists.sourceforge.net>
>         https://lists.sourceforge.net/lists/listinfo/viennacl-devel
>
>
>

