Karl Czajkowski <[email protected]> writes:

> You've only described a K x N processing problem, where you would run
> N kernels that each process one row of K values.  You haven't
> described any cross-communication or data shuffling if there are
> multiple such sub-problems, nor the approximate amount of work per
> input or output item.  Are your tasks truly independent?  What data
> management do you have to do to get your inputs into a parallel or
> distributed decomposition?
>
> At one far extreme, a high throughput job manager could be used to
> execute a set of independent PyOpenCL programs, each sized to fit on
> your OpenCL devices, each processing a different input file containing
> a subset of your N rows of data.
>
> In the middle are a huge number of choices to balance IO, memory, and
> compute resources. This leads to a huge number of different research
> programs all focusing on different niches and machine models.
>
> If you really want to abstract away the GPGPU devices, you might want
> to look at OpenMP or similar projects that have tried to add such
> devices to their targets for auto-vectorization.  I don't work in that
> space, and so have no specific recommendations.
>
> At the other extreme, I adopted PyOpenCL to allow me to do my ad-hoc
> processing in Python and Numpy with OpenCL callouts for certain
> bottlenecks.  I have some image processing tasks where, in some
> cases, there isn't even enough compute per byte of input to justify
> the PCIe transfer cost: they run at the same speed on an i7-4700MQ
> mobile quad-core CPU (using just SIMD+multicore) as on a desktop
> Kepler GPU.
>
> For me, data IO from disk or network would also dominate, so
> distributed processing is pointless.  Even so, I have used explicit
> sub-block decomposition to split my large images into smaller OpenCL
> tasks that can be staged through system RAM or GPU memory to improve
> locality and limit intermediate working-set sizes.

Karl's restatement of the problem has helped me understand better what
you might be looking for. With some work, these two could likely also help:

http://starpu.gforge.inria.fr/
http://legion.stanford.edu/
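
For concreteness, the sub-block decomposition Karl describes can be
sketched in plain NumPy (tile size and the per-tile operation here are
illustrative stand-ins; in a real pipeline each tile would be handed to
an OpenCL kernel instead):

```python
import numpy as np

def process_tile(tile):
    # Stand-in for an OpenCL kernel launch; here just a pointwise op.
    return tile * 2.0

def process_in_blocks(image, block=(256, 256)):
    """Process a large 2-D array tile by tile to bound the working set."""
    out = np.empty_like(image)
    h, w = image.shape
    bh, bw = block
    for r in range(0, h, bh):
        for c in range(0, w, bw):
            # NumPy slicing clips at the array edge, so ragged border
            # tiles are handled without special cases.
            out[r:r+bh, c:c+bw] = process_tile(image[r:r+bh, c:c+bw])
    return out

img = np.arange(12.0).reshape(3, 4)
result = process_in_blocks(img, block=(2, 2))
```

Each iteration only needs one tile resident, which is what limits the
intermediate working-set size.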

Andreas

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
