Yup, it can make a difference. =)

The trick you mention for conjugate gradient works because the only
thing the control flow has to know is whether to launch another
iteration - not what to do during that iteration. The actual work
performed in each CG iteration is independent of the state of the
solver.  This isn't the case for many other important optimization
problems, where the next optimization step depends on the result of the
current one.
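
To make that concrete, here is a rough sketch (toy objective, made-up
step-size rule - not anyone's actual solver) of a steepest-descent-style
loop where the next step depends on a reduction over the current state,
so the reduction kernels, the device-to-host copy, and the Python round
trip all sit on the critical path of every iteration:

import numpy as np
import pycuda.autoinit            # creates a CUDA context on import
import pycuda.gpuarray as gpuarray

# Toy objective ||x||^2, whose gradient is 2*x.
x = gpuarray.to_gpu(np.random.randn(1 << 20).astype(np.float32))

for it in range(50):
    g = 2.0 * x                                # elementwise kernel on the GPU
    g_norm2 = float(gpuarray.dot(g, g).get())  # reduction + copy + Python round trip
    if g_norm2 < 1e-8:                         # host-side control-flow decision
        break
    step = 0.25 / (1.0 + g_norm2) ** 0.5       # the next step needs g_norm2,
    x -= step * g                              # so nothing can be enqueued early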

- bryan

On Thu, May 31, 2012 at 8:18 AM, Andreas Kloeckner
<li...@informa.tiker.net> wrote:
> Bryan Catanzaro <bcatanz...@acm.org> writes:
>
>> I agree that data size matters in these discussions.  But I think the
>> right way to account for it is to show performance at a range of data
>> sizes, as measured from Python.
>>
>> The assumption that you'll keep the GPU busy isn't necessarily true.
>> thrust::reduce, for example (which max_element uses internally),
>> launches a big kernel, followed by a small kernel to finish the
>> reduction tree, followed by a cudaMemcpy to transfer the result back
>> to the host.  The GPU won't be busy during the small kernel, nor
>> during the cudaMemcpy, nor during the conversion back to Python, etc.
>> Reduce is often used to make control-flow decisions in optimization
>> loops, where you don't know what the next optimization step is until
>> the result is known, and so you can't launch the work speculatively.
>> If the control flow is performed in Python, all
>> these overheads are exposed to application performance - so I think
>> they matter.
>
> Glad you brought that up. :) The conjugate gradient solver in PyCUDA
> addresses exactly that by simply running iterations as fast as it can
> and shepherding the residual results to the host on their own time,
> deferring convergence decisions until the data is available. That was
> good for a 20% or so gain last time I measured it (on a GT200).
>
> Andreas
>
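
For comparison, a rough sketch of the kind of deferral Andreas describes
(made-up arrays and a fixed lag; the actual PyCUDA solver shepherds the
residuals back asynchronously rather than on a fixed schedule): keep
launching iterations, leave each residual norm on the device, and only
look at it a few iterations later, when the copy back to the host is
essentially free:

import numpy as np
import pycuda.autoinit                    # creates a CUDA context on import
import pycuda.gpuarray as gpuarray

lag = 3                                   # how far behind the convergence check runs
x = gpuarray.to_gpu(np.random.randn(1 << 20).astype(np.float32))
residuals = []                            # residual norms, left on the device

for it in range(200):
    # The real CG kernels would be enqueued here; as a stand-in, shrink x
    # so the "residual" actually decreases.
    x *= np.float32(0.9)
    residuals.append(gpuarray.dot(x, x))  # reduction result stays on the GPU

    # Decide convergence a few iterations late: by now the old reduction
    # has long since finished, so .get() returns almost immediately instead
    # of stalling the launch pipeline while the GPU drains.
    if it >= lag and float(residuals[it - lag].get()) < 1e-10:
        break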
