Hi,

I'm trying to write a library to support multiple GPUs in one Python thread, using the context-juggling approach suggested in the FAQ, but I'm having issues getting it to work properly. The general objective is to have each library master object be associated with its own CUDA context so that the user code can instantiate several objects, one (or more, for whatever reason) for each GPU. The library methods should deal with pushing and popping their contexts.

The first problem (i.e., the one I currently have a minimal test case for :)) is that I can't figure out how to get cleanup at the end of the program to work properly.

Here's minimal test case 1:
----------
#!/usr/bin/python

import pycuda.driver as cuda
cuda.init()
print("Initialized CUDA")

class gpuObject:
    def __init__(self, deviceID=0):
        self.device = cuda.Device(deviceID)
        # make_context() creates the context and leaves it current
        self.context = self.device.make_context(cuda.ctx_flags.SCHED_YIELD)

    def method1(self):
        self.context.push()
        # Do some work...
        self.context.pop()

def main():
    gpu0 = gpuObject(0)
    gpu1 = gpuObject(1)

if __name__ == "__main__":
    main()
---------

If I run this program, at the end I get a warning about destroying an active context (fair enough, I never popped it), and then a crash:

Initialized CUDA
-----------------------------------------------------------
PyCUDA WARNING: I'm being asked to destroy a
context that's part of the current context stack.
-----------------------------------------------------------
I will pick the next lower active context from the
context stack. Since this choice is happening
at an unspecified point in time, your code
may be making false assumptions about which
context is active at what point.
Call Context.pop() to avoid this warning.
-----------------------------------------------------------
If Python is terminating abnormally (eg. exiting upon an
unhandled exception), you may ignore this.
-----------------------------------------------------------
terminate called after throwing an instance of 'cuda::error'
what():  cuCtxPushCurrent failed: invalid value
Aborted

I then tried adding a __del__ method to gpuObject:

    def __del__(self):
        self.context.pop()

This leads to a segfault. Here's the output (along with some lines from instrumentation I added to src/cpp/cuda.hpp to help debug this):

Initialized CUDA
CUDA.HPP: popped context, new use count = 0
CUDA.HPP: Acquired context weak-pointer OK in context::current_context(), returning
CUDA.HPP: Acquired context weak-pointer OK in context::current_context(), returning
CUDA.HPP: Weak-pointer invalid (.expired() == true) in context::current_context(), popping and retrying
CUDA.HPP: Context stack was empty in context::current_context(), returning null shared_ptr
CUDA.HPP: Context stack was empty in context::current_context(), returning null shared_ptr
Segmentation fault


Any ideas what's going on here? Using the same or different deviceIDs for the two objects makes no difference. I'm also seeing some other context-management errors (e.g., no context being active immediately after a context.push()), but I'm still working out minimal test cases for those, and they may be related to this.
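For reference, here's a plain-Python model of the context-stack discipline I'm trying to follow. This doesn't use PyCUDA at all — FakeContext and its methods are my own stand-ins, not the real API — but it illustrates why destroying a context that's still on the stack is the failure mode, and why popping each context as soon as it's created should keep cleanup clean:

```python
# Toy stand-in for the CUDA context stack (NOT the real PyCUDA API).
context_stack = []

class FakeContext:
    def __init__(self, device_id):
        self.device_id = device_id
        self.destroyed = False
        context_stack.append(self)  # creation makes the context current

    def push(self):
        context_stack.append(self)

    def pop(self):
        assert context_stack and context_stack[-1] is self, \
            "popping a context that isn't current"
        context_stack.pop()

    def detach(self):
        assert self not in context_stack, \
            "destroying a context that's still on the stack"
        self.destroyed = True

# Discipline: deactivate each context right after creating it,
# and re-activate it only around actual work.
ctx0 = FakeContext(0)
ctx0.pop()
ctx1 = FakeContext(1)
ctx1.pop()

ctx1.push()  # do some work on device 1...
ctx1.pop()

# Cleanup: nothing is current, so both contexts can be destroyed safely.
ctx0.detach()
ctx1.detach()
print(context_stack)                    # -> []
print(ctx0.destroyed, ctx1.destroyed)   # -> True True
```

If my real code followed this balance exactly, I'd expect the stack to be empty at exit — which is why the warning above surprises me.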

Thanks for an awesome library!

Imran

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
