"CRV§ADER//KY" <[email protected]> writes:
> Hi all,
> I'm looking into setting up a cluster of GPGPU nodes. The nodes would be
> Linux-based and communicate with each other via Ethernet. Each node
> would have multiple GPUs.
>
> I need to run a problem that for 99% can be described as y[i] = f(x1[i],
> x2[i], ..., xn[i]), running on 1D vectors of data. In other words, I have n
> input vectors and 1 output vector, all of the same size, and the i-th
> worker will exclusively need to access the i-th element of every vector.
>
> Are there any frameworks, preferably in Python and with direct access to
> OpenCL, that transparently split the input data into segments, send
> them over the network, handle caching, feed executor queues, and so on?
>
> Data reuse is very heavy, so if a vector is already in VRAM I don't want
> to load it twice.
>
> Also, are there PyOpenCL bolt-ons that allow for virtual VRAM? That is,
> to have more buffers than fit in VRAM, and transparently swap those that
> are not immediately needed out to system RAM?
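The access pattern you describe (worker i only ever touches element i of every vector) at least makes the splitting step itself simple, whatever framework ends up doing the transport. A minimal pure-NumPy sketch of carving n input vectors into per-node segments (the helper name is hypothetical, not from any existing framework):

```python
import numpy as np

def split_for_workers(vectors, n_workers):
    """Split each 1D input vector into n_workers contiguous segments.

    Because y[i] = f(x1[i], ..., xn[i]), worker k only needs segment k
    of every vector, so the segments can be shipped independently.
    Returns a list with one tuple of segments per worker.
    """
    # array_split tolerates lengths that don't divide evenly.
    per_vector = [np.array_split(v, n_workers) for v in vectors]
    # Transpose: group segment k of every vector together for worker k.
    return list(zip(*per_vector))

x1 = np.arange(10, dtype=np.float32)
x2 = np.arange(10, 20, dtype=np.float32)
chunks = split_for_workers([x1, x2], 3)
# chunks[0] is (x1[0:4], x2[0:4]); worker 0 computes y[0:4] from it.
```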
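On the "virtual VRAM" question: if nothing ready-made turns up, the core of a swap-to-host scheme is just an LRU policy over the buffers currently resident on the device. A toy sketch of that idea (the `BufferCache` class and its API are hypothetical; real code would allocate `cl.Buffer` objects and copy to/from host arrays instead of the placeholder `load` callback):

```python
from collections import OrderedDict

class BufferCache:
    """Toy LRU sketch of 'virtual VRAM': keep at most `capacity`
    buffers resident, evicting the least-recently-used one when a
    new buffer must be brought in. Hypothetical helper, not PyOpenCL API.
    """
    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = OrderedDict()  # name -> buffer (stand-in for cl.Buffer)

    def get(self, name, load):
        if name in self.resident:
            # Cache hit: just mark the buffer as recently used.
            self.resident.move_to_end(name)
        else:
            if len(self.resident) >= self.capacity:
                # Evict the least-recently-used buffer ("swap to system RAM").
                self.resident.popitem(last=False)
            self.resident[name] = load(name)
        return self.resident[name]

cache = BufferCache(capacity=2)
buf = cache.get("x1", lambda name: bytes(16))  # stand-in for a host->device copy
```

A real implementation would also have to pin dirty buffers and copy them back to host memory on eviction, which is where most of the complexity lives.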
VirtCL is one. There was another, but I forgot what it was called.

https://dl.acm.org/citation.cfm?id=2688505

Andreas
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
