Hi all,

As you may know, OpenCL grid enqueue is subject to a data race. Behind the scenes, the kernel arguments are set first, and only then is the grid execution enqueued. Another thread using the same kernel could therefore race the current one, setting its own arguments between those two steps, so that a launch runs with the wrong arguments. To the best of my knowledge, this is the only spot where the CL API is not thread-safe.

I'm inclined to have PyOpenCL guard users against this trap by adding a lock to every kernel and using it to protect the 'set-enqueue' section. This would only affect Kernel.__call__. Anyone calling set_args and enqueue_nd_range_kernel manually would still have to take care of this themselves.

The overhead seems to be pretty negligible in the sequential case:

$ python -m timeit -s "import threading; l = threading.Lock()" -- "l.acquire(); l.release()"
10000000 loops, best of 3: 0.177 usec per loop
$ python -m timeit -s "import threading; l = threading.Lock()" -- "pass"
100000000 loops, best of 3: 0.00838 usec per loop
$ python -m timeit -s "import threading; l = threading.Lock(); u = 0" -- "u+=1"
10000000 loops, best of 3: 0.0277 usec per loop

If any of you have an opinion, positive or negative, I'd be happy to hear it.

Andreas
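
For illustration, the per-kernel lock described above could look roughly like the sketch below. It deliberately does not import PyOpenCL: FakeKernel, LockedKernel and run_demo are hypothetical names made up for this example, and FakeKernel only mimics the "arguments are shared state on the kernel object" behaviour. It is a sketch of the idea, not the actual PyOpenCL implementation.

```python
import threading


class FakeKernel:
    """Hypothetical stand-in for a CL kernel: arguments live on the object."""

    def __init__(self):
        self._args = None

    def set_args(self, *args):
        # Step 1: store the arguments on the shared kernel object.
        self._args = args

    def enqueue(self):
        # Step 2: "launch" and report which arguments the launch used.
        return self._args


class LockedKernel:
    """Hypothetical wrapper: one lock per kernel around set-then-enqueue."""

    def __init__(self, kernel):
        self._kernel = kernel
        self._lock = threading.Lock()

    def __call__(self, *args):
        # No other thread can set arguments between these two steps.
        with self._lock:
            self._kernel.set_args(*args)
            return self._kernel.enqueue()


def run_demo(n_threads=8, n_calls=1000):
    """Call one shared kernel from several threads; collect any mismatches."""
    kernel = LockedKernel(FakeKernel())
    mismatches = []

    def worker(tid):
        for i in range(n_calls):
            used = kernel(tid, i)
            if used != (tid, i):
                mismatches.append((tid, i, used))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mismatches


if __name__ == "__main__":
    print("mismatched launches:", len(run_demo()))
```

Because the lock is held only across the two steps, calls on different kernels do not block each other; only concurrent calls on the same kernel are serialized.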
