On Wed, 30 May 2012 22:13:27 +1200, Igor <rych...@gmail.com> wrote:
> Hi Andreas,
> I'm attaching an example for your wiki demonstrating how to find the
> position of the max element using both ReductionKernel and
> thrust-nvcc-ctypes. The latter doesn't quite work on Windows yet. It
> should work if you're on Linux; just change the FOLDER. There is a live
> version published on my Sage server
> (http://dev.math.canterbury.ac.nz/home/pub/26/ ). Both versions work
> there and show a discouraging 5-fold slowdown of ReductionKernel
> compared to thrust (run twice, as the .so file is loaded lazily?).
> Could you take a look and edit it if necessary?

Not a fair comparison. The PyCUDA test includes the transfer of the
result to the host (.get()). That doesn't look like it's the case for
thrust. Also, an 80 MB vector is tiny. At 200 GB/s, reading it once takes
about 4e-4 s, which is in the vicinity of launch overhead.
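
The 4e-4 s figure is just vector size divided by memory bandwidth. A
minimal sketch of that arithmetic, in plain Python with no GPU needed
(the function name streaming_time_s is made up for illustration, and
"80 MB" and "200 GB/s" are taken as decimal units):

```python
def streaming_time_s(n_bytes, bandwidth_bytes_per_s):
    """Lower bound on the time to read n_bytes once at the given bandwidth."""
    return n_bytes / bandwidth_bytes_per_s


vector_bytes = 80e6   # 80 MB vector, as quoted in the message (decimal MB assumed)
bandwidth = 200e9     # 200 GB/s, the bandwidth figure used in the message

t = streaming_time_s(vector_bytes, bandwidth)
print(f"bandwidth-limited time: {t:.1e} s")
```

This is only a lower bound for a single pass over the data. A measured
time well above it points at overheads other than memory traffic, such as
the host transfer mentioned above.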

Andreas


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
