Re: [PyOpenCL] Conserve memory in OpenCl Program

CRV§ADER//KY Tue, 01 Oct 2013 02:41:33 -0700

Let's see if I understood correctly,

you want to resolve a O(n^2) problem, where

out = [any(adj(input(i), input(j)) for j in G(input, i)) for i in
range(len(input)]

(in a very un-pythonic way, I’m cycling on indexes instead of values – just
because I think it helps understand better).

The problem is that I don't know what G is. N elements adjacent to i in the
input vector? (in that case the problem is O(n)). A completely arbitrary,
unpredictable function that extracts N elements from the whole input array?
(O(n^2))

The following is generic solution that works in the latter, worse case,
even in the case where G(input, i) returns a vector as large as input, and
allows to solve the problem on an arbitrarily big input (although
performance will be hurt).

You're trying to solve adj(input(i), input(j)) for all points of a square,
each point having coordinates (i,j). You can visualize the input array as
being both the vertical and the horizontal edge of the square. Now split
the edge in N segments, small enough to fit in your memory; you're thus
producing N^2 subsquares. You now have to resolve the problem on all
subsquares in turn, loading only one or two at a time into memory.

Let's say N=3. Your problem becomes

inp1, inp2, inp3 = split(input, 3)

out11 = [any(adj(inp1(i), inp1(j)) for j in G(inp1, i)) for i in
range(len(inp1)]

out12 = [any(adj(inp1(i), inp2(j)) for j in G(inp2, i)) for i in
range(len(inp1)]

out13 = [any(adj(inp1(i), inp3(j)) for j in G(inp3, i)) for i in
range(len(inp1)]

out21 = [any(adj(inp2(i), inp1(j)) for j in G(inp1, i)) for i in
range(len(inp2)]

out22 = [any(adj(inp2(i), inp2(j)) for j in G(inp2, i)) for i in
range(len(inp2)]

out23 = [any(adj(inp2(i), inp3(j)) for j in G(inp3, i)) for i in
range(len(inp2)]

out31 = [any(adj(inp3(i), inp1(j)) for j in G(inp1, i)) for i in
range(len(inp3)]

out32 = [any(adj(inp3(i), inp2(j)) for j in G(inp2, i)) for i in
range(len(inp3)]

out33 = [any(adj(inp3(i), inp3(j)) for j in G(inp3, i)) for i in
range(len(inp3)]

Then you have to recombine out.

This MAY OR MAY NOT work for you, depending on the math properties of G().
The math property I'm relying on is that I can apply G to a subset of
input, recombine the output afterwards and obtain the same result. This may
not be the case, e.g. G(input, i) = “the 4 nodes whose value is closest to
input(i)”. You can still solve this case, but it adds an additional
complication: for every output point, you’ll have to save the input that
originated it as well, and recursively apply G to the “local winner” inputs
when recombining.

As you can see, this solution presents much more memory transfers than the
monolithic one and considerably more complex to implement on the python
side - this can't be helped.

However, you'll mitigate the performance degradation if you load the next
buffer while you calculate the current one, e.g. load everything needed to
calculate out12 while you’re calculating out11.

HTH
On 1 Oct 2013 06:13, "Tyler Hardin" <[email protected]> wrote:

> Hi all, I'm writing an opencl kernel to find maximal cliques. The kernel
> selects a node by the thread gid and finds the max subgraph containing that
> node. Unfortunately, around 10000 nodes, I get an out of memory error. The
> program is for one of my math professors and tests a conjecture of his, so
> I can't post the code in it's entirety, but I'll give a simplified example.
>
> I've tried cutting down the size of the static arrays in the functions
> that have them. Is there anything else I can do other than get a card with
> more memory?
>
> Thanks for your time,
> Tyler
>
> #!/usr/bin/python2.7
>
> import pyopencl as cl
> import numpy
>
> def max_clique(nodes_list, resid_list):
>     resid_list.append(0)
>
>     num_nodes = numpy.int32(len(nodes_list))
>     num_resid = numpy.int32(len(resid_list))
>
>     nodes_arr = numpy.array(nodes_list, dtype='i4')
>     resid_arr = numpy.array(resid_list, dtype='i4')
>
>     ctx = cl.create_some_context()
>     queue = cl.CommandQueue(ctx)
>
>     mf = cl.mem_flags
>     nodes_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
> hostbuf=nodes_arr)
>     resid_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
> hostbuf=resid_arr)
>     dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, nodes_arr.nbytes)
>
>     prg = cl.Program(ctx, """
>         bool adj(int node1, int node2, __global int *adj)
>         {
>             // are node1 and node2 adjacent?
>         }
>
>         // checks if node is adj to all nodes in chain
>         bool adj_all(int* nodes, int* G, int chain_len, int node, __global
> int* adj){
>             // is node adjacent to to all nodes in G?
>         }
>
>         #define MAX_NODES 64
>         int chain(int num_nodes, int* nodes, __global int* connect)
>         {
>             int cur[MAX_NODES];
>
>             // Find max clique in G
>         }
>
>         __kernel void max_clique(__global int *nodes,
>                                             __global int *resid,
>                                             __global int *out,
>                                             int num_nodes,
>                                             int num_resid)
>         {
>             int gid = get_global_id(0);
>             int G[constant];
>             // find subgraph around nodes[gid], we call it G, store
>             // it's nodes in G
>             out[gid] = // max clique of G;
>         }
>         """).build()
>
>     prg.max_clique(queue, nodes_arr.shape, None, nodes_buf,
>                                                     resid_buf,
>                                                     dest_buf,
>                                                     num_nodes,
>                                                     num_resid)
>     cliques = numpy.empty_like(nodes_arr)
>     cl.enqueue_copy(queue, cliques, dest_buf)
>     return max(cliques)
>
> print max_clique(range(0, 10000), range(1, 10))
>
> _______________________________________________
> PyOpenCL mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pyopencl
>
>

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl

Re: [PyOpenCL] Conserve memory in OpenCl Program

Reply via email to