I second that the caching should be done internally by PyOpenCL, as this
would produce more readable code and avoid obscure bugs like the original
poster's. Is there any reason not to?

As a general Python rule: if you have a @property that takes a
non-negligible amount of time to compute after the first invocation, you're
doing it wrong, and it should either be cached or be replaced by a
function. Just my 2c.
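For what it's worth, the standard library now covers this pattern directly: functools.cached_property (Python 3.8+) computes the value on first access and stores it on the instance afterwards. A minimal sketch, with made-up class and attribute names:

```python
import time
from functools import cached_property


class Dataset:
    """Hypothetical example of the rule above: an expensive
    computation hiding behind a property should be cached."""

    @cached_property
    def values(self):
        # Stand-in for something expensive (e.g. building a kernel).
        time.sleep(0.01)
        return [1, 2, 3]


d = Dataset()
first = d.values   # computed once, then stored on the instance
second = d.values  # returned from the cache; same object
assert first is second
```

If the computation really should run every time, making it an explicit method call (d.values()) at least signals the cost to the caller.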
On 14 Feb 2016 3:50 am, "Dustin Kleckner" <[email protected]> wrote:

> Thanks for the quick reply — that fixed it.
>
> Just out of curiosity, is there a compelling reason not to cache the
> kernel code in the program objects and quickly return it on repeated
> calls?  I generally wouldn’t expect calling a method repeatedly to be
> significantly slower than getting a copy and then calling it.  I guess
> what you’re saying is that I shouldn’t think of “prg.sum” as a method,
> but rather as an argument-less function that returns a method?  In that
> case, shouldn’t I expect the syntax to be “sum_knl = prg.sum()”?
>
> Best,
> Dustin
>
> PS: Thanks for writing pyopencl — it has made my life much easier!
>
>
> > On Feb 13, 2016, at 6:42 PM, Andreas Kloeckner <[email protected]>
> wrote:
> >
> > Dustin Kleckner <[email protected]> writes:
> >> I’ve been using pyopencl for a while for various simulation/data
> processing tasks.  I recently upgraded to a new computer, and noticed
> things were considerably slower.
> >>
> >> After some experimentation, I tracked this down to the version of
> pyopencl I was using.  The updated version (2015.2.4; most recent on pypi)
> takes significantly longer to queue a function call (~1.5 ms) than the old
> version (2015.1, ~0.03 ms).  Both times come from the same machine*.
> Profiling indicates that the newer version is making lots of function calls
> the old version did not.  FYI, the code I used to test this is below
> (adapted from documentation).
> >>
> >> For my purposes, this is slightly alarming: my code makes lots of
> kernel calls, in which case the new version is 50x slower for small data
> sets!
> >>
> >> Is this something that has been/will be fixed in newer versions of
> pyopencl?  Is there a workaround?  Of course, for the time being I can use
> the old version, but I’d rather not be stuck with it.
> >>
> >> If needed, I can provide the profiler output.
> >
> > tl;dr: Hang on to the kernel object, i.e. 'sum_knl = prg.sum'. It's used
> > for caching stuff.
> >
> > PyOpenCL 2015.2 generates custom Python code to make kernel invocation
> > *faster* (not slower). Generating this code (which gets attached to the
> > kernel object, prg.sum) takes time, and every time you call 'prg.sum',
> > you get a new kernel object. So you're likely mainly benchmarking the
> > generation (and compilation) of the invoker code.
> >
> > HTH,
> > Andreas
>
>
> _______________________________________________
> PyOpenCL mailing list
> [email protected]
> https://lists.tiker.net/listinfo/pyopencl
>