Hello,

in the meantime I made a first attempt at this issue.
Unfortunately it is still very slow.
I am using atomicAdd on a shared memory variable at the moment.
I read somewhere that it is recommended to use shared memory
instead of global memory because it is faster. Is this true in general?

This is roughly the code I am using:

if (condition)
{
    // reserve the next free slot via the shared counter
    int cnt = atomicAdd(&shrd_cnt, 1);
    out[cnt] = i;
}

Is there anything wrong with this approach?
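
In case the snippet above is too terse, this is roughly the full pattern
I have in mind: the shared counter only collects the hits of one block,
and a single global atomicAdd per block then reserves a contiguous range
in the output. All names (d_in, d_out, d_count) and the condition are
just placeholders, and d_count would have to be zeroed before the launch.

__global__ void collect_hits(const float *d_in, int *d_out,
                             unsigned int *d_count, int n)
{
    __shared__ unsigned int shrd_cnt;   // number of hits in this block
    __shared__ unsigned int base;       // start of this block's range in d_out

    if (threadIdx.x == 0)
        shrd_cnt = 0;
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int slot = -1;
    if (i < n && d_in[i] > 0.0f)        // stand-in for the real condition
        slot = atomicAdd(&shrd_cnt, 1); // block-local atomic in shared memory
    __syncthreads();

    if (threadIdx.x == 0)               // one global atomic per block
        base = atomicAdd(d_count, shrd_cnt);
    __syncthreads();

    if (slot >= 0)
        d_out[base + slot] = i;         // write this thread's index to its slot
}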

I am still thinking about how it could be done without
atomicAdd. One idea would be to initialize the
"out" array with e.g. -1.

Each thread would then check the array before writing whether one of
the other threads has already added something:

// advance to the first slot that is still empty (-1)
while (out[j] != -1)
{
    j++;
}
out[j] = i;

But I think this is prone to race conditions when several threads
access the memory at the same time.
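
As far as I can tell, claiming a slot safely would still need an atomic
operation anyway, something like atomicCAS, so this would not really
avoid atomics. Roughly:

// only the thread that still sees -1 in out[j] wins that slot;
// everyone else moves on to the next one
int j = 0;
while (atomicCAS(&out[j], -1, i) != -1)
{
    j++;
}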

Andreas, you mentioned doing "a scan within each block".
Do you mean something like each thread first counting how many indices
it wants to add to the global list and then adding them in a second
step? I tried to sketch below what I think you mean.
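
This is only a guess based on your description (a scan over per-thread
flags in shared memory to compute the write positions, each block
writing into its own fixed-size region, and a flag for the do-over if
that region overruns). All names (capacity_per_block, d_overflow,
d_block_count) and the condition are made up by me:

__global__ void compact_block(const float *d_in, int *d_out,
                              int *d_block_count, int *d_overflow,
                              int n, int capacity_per_block)
{
    __shared__ int scan_buf[256];   // assumes blockDim.x <= 256
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // 1) each thread notes whether it wants to emit its index
    int flag = (i < n && d_in[i] > 0.0f) ? 1 : 0;   // stand-in condition
    scan_buf[tid] = flag;
    __syncthreads();

    // 2) inclusive scan (Hillis-Steele) over the flags in shared memory
    for (int offset = 1; offset < blockDim.x; offset *= 2)
    {
        int v = (tid >= offset) ? scan_buf[tid - offset] : 0;
        __syncthreads();
        scan_buf[tid] += v;
        __syncthreads();
    }

    // 3) each flagged thread writes into its block's private region of d_out
    if (flag)
    {
        int slot = scan_buf[tid] - 1;   // exclusive position = inclusive - 1
        if (slot < capacity_per_block)
            d_out[blockIdx.x * capacity_per_block + slot] = i;
        else
            *d_overflow = 1;            // region too small, do-over needed
    }

    // 4) last thread records how many entries this block produced
    if (tid == blockDim.x - 1)
        d_block_count[blockIdx.x] = scan_buf[tid];
}

The host would zero d_overflow before the launch, check it afterwards,
and re-run with a larger capacity_per_block if it was set. Is that
roughly what you meant?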

Kind regards,
Joe

>> Hello,

I have a question and hope that you can help me.

A block is working on a common problem; each thread iterates through
its own part of the problem.
Now if some condition is met, a thread should write its threadId
to a 1D output array which is smaller than the total number of threads.

I would rather not store all of the results as integers,
since the condition is only met in very rare cases.

The two options I found would be

1.) to store all results in a bitfield which is as long as there are
threads and use bitwise atomicAnd.

2.) to share a common index within a block which is incremented
atomically, and use the return value of atomicAdd to store the
threadId there.

Is one of these ideas to be preferred? Or do you have
better suggestions for doing this?

This sounds tricky. A reasonable design might be to allocate space so
that every block has room to write out twice or three times its expected
number of outputs, use a scan within each block to compute indices, and
have some sort of failure indication (+do-over) if the allocated output
space overruns.

Andreas


