On Tue, 29 May 2012 22:17:52 +0200, Jerome Kieffer <jerome.kief...@esrf.fr> wrote:
> Dear PyCUDA community,
>
> First of all I would like to introduce myself: I am a scientific
> developer, and I am pretty new to PyCUDA (even though I have followed
> a CUDA course). I would like to port part of a very big application
> to the GPU, switching from FFTW to scikits.cuda (the cu_fft part).
> This was straightforward, thanks to the very good abstraction done in
> PyCUDA, and I already get a 5x speed-up with exactly the same results
> as FFTW.
>
> My problems start when integrating the code with Python threads: the
> large application makes its PyCUDA calls from different threads, and
> it ends up with memory leaks on the GPU, crashing after a couple of
> minutes. So I need to force all Python threads to use the same
> context on the GPU.
Right. IOW, it's a mess. Multiple threads fighting over one GPU context
is not a good idea, and if you have *any* other way to design your
program, use that.

If you *need* to use this design, there is a way to prevent the leaks:
also manage all your memory manually (see, it's getting prettier by the
minute). The problem is that PyCUDA cannot guarantee that it can
activate an object's home context at garbage-collection time, so a
deallocation triggered by the garbage collector may run against the
wrong context and leak. Freeing device memory explicitly, while the
right context is current, avoids that. There's a rough sketch of the
pattern in the P.S. below.

Alternatively, PyOpenCL (and PyFFT) make threading, even for a single
context shared across multiple host threads, completely painless:
nearly all of the CL API is thread-safe by definition (CL 1.2 standard,
Section A.2). End of story. The P.P.S. shows the same sharing pattern
in PyOpenCL.

> I have another question: why is data1_gpu.ptr changing, whereas
> data2_gpu.ptr and plan stay fixed (as expected), in my code?

No idea. When I wrote GPUArray, I thought of .ptr as a read-only
attribute. Seems scikits.cuda has a different opinion.

Andreas
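P.S. Since people keep asking what "manage all your memory manually"
looks like in practice, here is a rough, untested sketch: one context,
created up front, pushed and popped by whichever thread needs it, with
explicit mem_alloc/free so nothing is left for the garbage collector to
clean up in the wrong context. The worker function and the lock are
made up for illustration; the pycuda.driver calls themselves are real.

    import threading
    import numpy as np
    import pycuda.driver as cuda

    cuda.init()
    ctx = cuda.Device(0).make_context()  # current on the creating thread
    ctx.pop()                            # detach it so workers can push it

    ctx_lock = threading.Lock()          # one thread in the context at a time

    def worker(host_data):
        with ctx_lock:
            ctx.push()                   # make the shared context current here
            try:
                dev_buf = cuda.mem_alloc(host_data.nbytes)
                cuda.memcpy_htod(dev_buf, host_data)
                # ... run kernels / cu_fft plans against dev_buf ...
                cuda.memcpy_dtoh(host_data, dev_buf)
                dev_buf.free()           # free by hand, context still current
            finally:
                ctx.pop()                # give the context up before unlocking

    threads = [threading.Thread(target=worker,
                                args=(np.ones(1024, np.float32),))
               for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()

    ctx.detach()                         # release the context at exit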
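P.P.S. For comparison, the same sharing in PyOpenCL, again as an
untested sketch (the "twice" kernel is just a stand-in for real work
such as a PyFFT plan execution): one context, one command queue per
host thread, and no lock anywhere. Note the per-thread kernel object:
setting arguments on a kernel shared between threads is the one thing
Section A.2 does *not* cover.

    import threading
    import numpy as np
    import pyopencl as cl

    ctx = cl.Context(cl.get_platforms()[0].get_devices())

    prg = cl.Program(ctx, """
        __kernel void twice(__global float *a)
        { a[get_global_id(0)] *= 2; }
        """).build()

    def worker():
        queue = cl.CommandQueue(ctx)     # queues are cheap; one per thread
        twice = cl.Kernel(prg, "twice")  # per-thread kernel object
        a = np.ones(1024, np.float32)
        buf = cl.Buffer(ctx,
                        cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR,
                        hostbuf=a)
        twice(queue, a.shape, None, buf)
        cl.enqueue_copy(queue, a, buf)   # blocking by default
        assert (a == 2).all()

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()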