Hi,
I'm trying to write a library to support multiple GPUs in one Python
thread, using the context-juggling approach suggested in the FAQ, but
I'm having issues getting it to work properly. The goal is for each
library master object to be associated with its own CUDA context, so
that user code can instantiate several objects, one (or more, for
whatever reason) per GPU. The library's methods handle pushing and
popping their contexts.
The first problem (which is to say, the one for which I've currently got
a minimal test case :)) is that I can't figure out how to get cleanup at
the end of the program to work properly.
Here's minimal test case 1:
----------
#!/usr/bin/python
import pycuda.driver as cuda

cuda.init()
print "Initialized CUDA"

class gpuObject:
    def __init__(self, deviceID=0):
        self.device = cuda.Device(deviceID)
        self.context = self.device.make_context(cuda.ctx_flags.SCHED_YIELD)

    def method1(self):
        self.context.push()
        # Do some work...
        self.context.pop()

def main():
    gpu0 = gpuObject(0)
    gpu1 = gpuObject(1)

if __name__ == "__main__":
    main()
---------
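To check my understanding of the push/pop discipline involved, here's a
plain-Python mock of the per-thread context stack (the class and all
names are my own invention, not PyCUDA's API). It reproduces the shape
of the problem: make_context leaves the new context current, so if I
never pop, both contexts are still on the stack when the program exits,
and destroying a context that's still on the stack is exactly the
situation the warning describes:

```python
class MockContext:
    # Shared "current context" stack, standing in for CUDA's
    # per-thread context stack.
    stack = []

    def __init__(self):
        # make_context leaves the newly created context current.
        MockContext.stack.append(self)

    def push(self):
        MockContext.stack.append(self)

    def pop(self):
        assert MockContext.stack and MockContext.stack[-1] is self, \
            "popping a context that isn't current"
        MockContext.stack.pop()

    def destroy(self):
        # Destroying a context still on the stack is the case
        # PyCUDA warns about at interpreter exit.
        if self in MockContext.stack:
            raise RuntimeError("destroying a context still on the stack")

ctx0 = MockContext()
ctx1 = MockContext()

# Without popping, destruction at "exit" fails:
try:
    ctx0.destroy()
except RuntimeError as e:
    print(e)

# Popping in LIFO order first makes destruction safe:
ctx1.pop()
ctx0.pop()
ctx0.destroy()
ctx1.destroy()
print("clean exit")
```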
If I run this program, at the end I get a warning about destroying an
active context (fair enough, I never popped it), and then a crash:
Initialized CUDA
-----------------------------------------------------------
PyCUDA WARNING: I'm being asked to destroy a
context that's part of the current context stack.
-----------------------------------------------------------
I will pick the next lower active context from the
context stack. Since this choice is happening
at an unspecified point in time, your code
may be making false assumptions about which
context is active at what point.
Call Context.pop() to avoid this warning.
-----------------------------------------------------------
If Python is terminating abnormally (eg. exiting upon an
unhandled exception), you may ignore this.
-----------------------------------------------------------
terminate called after throwing an instance of 'cuda::error'
what(): cuCtxPushCurrent failed: invalid value
Aborted
I then tried adding a __del__ method to the gpuObject:
    def __del__(self):
        self.context.pop()
        return
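Since __del__ at interpreter shutdown runs at an unpredictable time
(possibly after module teardown), I'm also considering popping the
contexts explicitly before main() returns instead. Here's a sketch of
that structure; FakeContext and its bookkeeping are stand-ins of my
own so the snippet is self-contained, not real PyCUDA calls:

```python
class FakeContext:
    # Records pop() calls so the cleanup order is visible.
    popped = []

    def __init__(self, device_id):
        self.device_id = device_id

    def pop(self):
        FakeContext.popped.append(self.device_id)

class gpuObject:
    def __init__(self, device_id):
        self.context = FakeContext(device_id)

def main():
    gpu0 = gpuObject(0)
    gpu1 = gpuObject(1)
    try:
        pass  # ... do work ...
    finally:
        # Pop in reverse creation order, so each pop matches
        # whatever is on top of the stack.
        gpu1.context.pop()
        gpu0.context.pop()

main()
print(FakeContext.popped)  # [1, 0]
```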
This leads to a segfault. Here's the output (along with some lines from
instrumentation I added to src/cpp/cuda.hpp to try to help debug this):
Initialized CUDA
CUDA.HPP: popped context, new use count = 0
CUDA.HPP: Acquired context weak-pointer OK in
context::current_context(), returning
CUDA.HPP: Acquired context weak-pointer OK in
context::current_context(), returning
CUDA.HPP: Weak-pointer invalid (.expired() == true) in
context::current_context(), popping and retrying
CUDA.HPP: Context stack was empty in context::current_context(),
returning null shared_ptr
CUDA.HPP: Context stack was empty in context::current_context(),
returning null shared_ptr
Segmentation fault
Any ideas what's going on here? The two deviceIDs can be the same or
different; it makes no difference. I have some other context-management
errors as well (e.g., no context being active immediately after a
context.push()), but I'm still trying to work out minimal test cases
for them, and they may be related to this.
Thanks for an awesome library!
Imran
_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net