Re: [PyOpenCL] Conserve memory in OpenCl Program

Tyler Hardin Tue, 01 Oct 2013 06:42:32 -0700

I don't think that will work for this. I should have been a little more
precise in my short, stripped down version. The nodes are all integers and
two nodes are adjacent if their difference is equal to a number in resid.
The desired output is actually just the size of the max clique. If I
understand (not sure), what it looks like you're doing is only the first
part of the algo described below. I might be able to do that part in
Python, then the clique finding in OpenCL though, which would reduce the
size of memory needed for the CL part, so thanks!


Let's call the big graph, the one with all nodes in the array nodes[], H.
First, for gid N, I find the neighbourhood of nodes[N]. This forms a
subgraph, we'll call it HN. I don't know anything about this graph (whether
it's complete). Then, I take the resulting graph, and find it's max clique.
It really isn't an algorithmic improvement over the single threaded version
of the problem (actually, it's worse, because, say H has a max clique of C,
I'll find that same max clique at least C times. Once for each node in the
max clique.) The advantage comes from the average massive shrink in the
size of the graphs I'm working with. Instead of finding the max clique in a
graph of size |H|, I'm finding the max clique of N graphs of size less than
or equal to the largest order of any vertex in H.


On Tue, Oct 1, 2013 at 5:40 AM, CRV§ADER//KY <[email protected]> wrote:

> Let's see if I understood correctly,
>
> you want to resolve a O(n^2) problem, where
>
>
>
> out = [any(adj(input(i), input(j)) for j in G(input, i)) for i in
> range(len(input)]
>
>
>
> (in a very un-pythonic way, I’m cycling on indexes instead of values –
> just because I think it helps understand better).
>
>
>
> The problem is that I don't know what G is. N elements adjacent to i in
> the input vector? (in that case the problem is O(n)). A completely
> arbitrary, unpredictable function that extracts N elements from the whole
> input array? (O(n^2))
>
> The following is generic solution that works in the latter, worse case,
> even in the case where G(input, i) returns a vector as large as input, and
> allows to solve the problem on an arbitrarily big input (although
> performance will be hurt).
>
>
>
> You're trying to solve adj(input(i), input(j)) for all points of a square,
> each point having coordinates (i,j). You can visualize the input array as
> being both the vertical and the horizontal edge of the square. Now split
> the edge in N segments, small enough to fit in your memory; you're thus
> producing N^2 subsquares. You now have to resolve the problem on all
> subsquares in turn, loading only one or two at a time into memory.
>
>
>
> Let's say N=3. Your problem becomes
>
> inp1, inp2, inp3 = split(input, 3)
>
> out11 = [any(adj(inp1(i), inp1(j)) for j in G(inp1, i)) for i in
> range(len(inp1)]
>
> out12 = [any(adj(inp1(i), inp2(j)) for j in G(inp2, i)) for i in
> range(len(inp1)]
>
> out13 = [any(adj(inp1(i), inp3(j)) for j in G(inp3, i)) for i in
> range(len(inp1)]
>
> out21 = [any(adj(inp2(i), inp1(j)) for j in G(inp1, i)) for i in
> range(len(inp2)]
>
> out22 = [any(adj(inp2(i), inp2(j)) for j in G(inp2, i)) for i in
> range(len(inp2)]
>
> out23 = [any(adj(inp2(i), inp3(j)) for j in G(inp3, i)) for i in
> range(len(inp2)]
>
> out31 = [any(adj(inp3(i), inp1(j)) for j in G(inp1, i)) for i in
> range(len(inp3)]
>
> out32 = [any(adj(inp3(i), inp2(j)) for j in G(inp2, i)) for i in
> range(len(inp3)]
>
> out33 = [any(adj(inp3(i), inp3(j)) for j in G(inp3, i)) for i in
> range(len(inp3)]
>
> Then you have to recombine out.
>
>
>
> This MAY OR MAY NOT work for you, depending on the math properties of G().
> The math property I'm relying on is that I can apply G to a subset of
> input, recombine the output afterwards and obtain the same result. This may
> not be the case, e.g. G(input, i) = “the 4 nodes whose value is closest to
> input(i)”. You can still solve this case, but it adds an additional
> complication: for every output point, you’ll have to save the input that
> originated it as well, and recursively apply G to the “local winner” inputs
> when recombining.
>
>
>
> As you can see, this solution presents much more memory transfers than the
> monolithic one and considerably more complex to implement on the python
> side - this can't be helped.
>
> However, you'll mitigate the performance degradation if you load the next
> buffer while you calculate the current one, e.g. load everything needed to
> calculate out12 while you’re calculating out11.
>
> HTH
> On 1 Oct 2013 06:13, "Tyler Hardin" <[email protected]> wrote:
>
>> Hi all, I'm writing an opencl kernel to find maximal cliques. The kernel
>> selects a node by the thread gid and finds the max subgraph containing that
>> node. Unfortunately, around 10000 nodes, I get an out of memory error. The
>> program is for one of my math professors and tests a conjecture of his, so
>> I can't post the code in it's entirety, but I'll give a simplified example.
>>
>> I've tried cutting down the size of the static arrays in the functions
>> that have them. Is there anything else I can do other than get a card with
>> more memory?
>>
>> Thanks for your time,
>> Tyler
>>
>> #!/usr/bin/python2.7
>>
>> import pyopencl as cl
>> import numpy
>>
>> def max_clique(nodes_list, resid_list):
>>     resid_list.append(0)
>>
>>     num_nodes = numpy.int32(len(nodes_list))
>>     num_resid = numpy.int32(len(resid_list))
>>
>>     nodes_arr = numpy.array(nodes_list, dtype='i4')
>>     resid_arr = numpy.array(resid_list, dtype='i4')
>>
>>     ctx = cl.create_some_context()
>>     queue = cl.CommandQueue(ctx)
>>
>>     mf = cl.mem_flags
>>     nodes_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
>> hostbuf=nodes_arr)
>>     resid_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
>> hostbuf=resid_arr)
>>     dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, nodes_arr.nbytes)
>>
>>     prg = cl.Program(ctx, """
>>         bool adj(int node1, int node2, __global int *adj)
>>         {
>>             // are node1 and node2 adjacent?
>>         }
>>
>>         // checks if node is adj to all nodes in chain
>>         bool adj_all(int* nodes, int* G, int chain_len, int node,
>> __global int* adj){
>>             // is node adjacent to to all nodes in G?
>>         }
>>
>>         #define MAX_NODES 64
>>         int chain(int num_nodes, int* nodes, __global int* connect)
>>         {
>>             int cur[MAX_NODES];
>>
>>             // Find max clique in G
>>         }
>>
>>         __kernel void max_clique(__global int *nodes,
>>                                             __global int *resid,
>>                                             __global int *out,
>>                                             int num_nodes,
>>                                             int num_resid)
>>         {
>>             int gid = get_global_id(0);
>>             int G[constant];
>>             // find subgraph around nodes[gid], we call it G, store
>>             // it's nodes in G
>>             out[gid] = // max clique of G;
>>         }
>>         """).build()
>>
>>     prg.max_clique(queue, nodes_arr.shape, None, nodes_buf,
>>                                                     resid_buf,
>>                                                     dest_buf,
>>                                                     num_nodes,
>>                                                     num_resid)
>>     cliques = numpy.empty_like(nodes_arr)
>>     cl.enqueue_copy(queue, cliques, dest_buf)
>>     return max(cliques)
>>
>> print max_clique(range(0, 10000), range(1, 10))
>>
>> _______________________________________________
>> PyOpenCL mailing list
>> [email protected]
>> http://lists.tiker.net/listinfo/pyopencl
>>
>>
> _______________________________________________
> PyOpenCL mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pyopencl
>
>

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl

Re: [PyOpenCL] Conserve memory in OpenCl Program

Reply via email to