Now in your situation there's a failure when reactivating the context in order to detach from it, probably because the runtime is meddling about. The only reason cuCtxPushCurrent would return "invalid value" is, in my opinion, that the context is already somewhere on the context stack. So it's likely that the runtime reactivated the context. In current git, a failing PushCurrent call no longer raises an error--it prints a warning instead.
I believe different contexts can't share variables that reside on the GPU. I can pass GPUArray objects as arguments to my FFT functions, and these objects still exist after the FFT completes, so I think the FFT is using the same context as PyCUDA.

I made the change shown in the attached diff, such that context.synchronize() and context.detach() print out the context stack size, and detach() also prints whether the current context is active. With this I verified that the stack size is 1 before and after running the FFT code, and that the context does not change.

My guess is that CUFFT makes some change to the current context such that once this context is popped, it is automatically destroyed. If my guess is correct, then calling context.detach() would destroy the context, since its usage count drops to 0, and that would circumvent the warning message when the context destructor is called. I don't want users of my package to see a warning message when Python exits, so instead of using autoinit, my package creates a context explicitly.

The following is somewhat unrelated. I have done some experiments with contexts. I think context.pop() always pops the top context off the stack, regardless of whether that particular context is actually at the top. E.g., I can create two contexts c1 and c2 and then call c1.pop() twice without getting an error.
Regarding the complex number support, I discovered cuComplex.h in /usr/local/cuda/include. It includes complex.h from the C library if CU_USE_NATIVE_COMPLEX is defined. Would cuComplex.h or complex.h be a better way to implement complex numbers?

Do you know to what extent this is documented? I wouldn't want Nvidia changing this stuff out from under us. Also, since what CUDA version has it been around?
If it pans out, this does sound like a pretty good plan.
cuComplex.h has existed since CUDA 2.1 and hasn't changed in subsequent versions. It is used by cufft.h and cublas.h. I can't find any documentation for it. A quick Google search shows that JCuda seems to be using it: http://www.jcuda.org/jcuda/jcublas/doc/jcuda/jcublas/cuComplex.html

Maybe we can simply use complex.h from the GNU C library. A quick search on my Ubuntu machine locates the following files:
/usr/include/complex.h, which includes
/usr/include/c++/4.4/complex.h, which then includes
/usr/include/c++/4.4/ccomplex, which in turn includes
/usr/include/c++/4.4/complex, which contains the operator overloads for complex numbers.

GNU C is available on most platforms, and I think nvcc requires GCC (at least on Linux).
By the way, will you attend PyCon 2010 next month? I'll be there presenting a poster. If you'll be there, you might consider organizing a sprint on PyCUDA.

I'd love to, but my number one priority for the next few months is finishing my PhD. This is an excellent idea, though--I'd be happy to hang out on IRC and support such a sprint remotely if there is interest.
I am finishing my PhD too, and I am lucky to have PyCon coming to my city. Good luck with your PhD.

Daniel
diff --git a/src/cpp/cuda.hpp b/src/cpp/cuda.hpp
index ee3036b..bae552c 100644
--- a/src/cpp/cuda.hpp
+++ b/src/cpp/cuda.hpp
@@ -423,10 +423,14 @@ namespace cuda
           bool active_before_destruction = current_context().get() == this;
           if (active_before_destruction)
           {
+              std::cerr << "context active before destruction." << std::endl;
+	      std::cerr << get_context_stack().size() << std::endl;
             CUDAPP_CALL_GUARDED_CLEANUP(cuCtxDetach, (m_context));
           }
           else
           {
+              std::cerr << "context inactive before destruction." << std::endl;
+	      std::cerr << get_context_stack().size() << std::endl;
             if (m_thread == boost::this_thread::get_id())
             {
               CUDAPP_CALL_GUARDED_CLEANUP(cuCtxPushCurrent, (m_context));
@@ -499,7 +503,9 @@ namespace cuda
 #endif
 
       static void synchronize()
-      { CUDAPP_CALL_GUARDED_THREADED(cuCtxSynchronize, ()); }
+      {
+	std::cerr << get_context_stack().size() << std::endl;
+	CUDAPP_CALL_GUARDED_THREADED(cuCtxSynchronize, ()); }
 
       static boost::shared_ptr<context> current_context(context *except=0)
       {
_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
