Alex Park <a...@nervanasys.com> writes:

> Hi,
>
> I'm trying to use multiple gpus with mpi and ipc handles instead of the
> built-in mpi primitives to p2p communication.
>
> I think I'm not quite understanding how contexts should be managed. For
> example, I have two versions of a toy example to try out accessing data
> between nodes via ipc handle. Both seem to work, in the sense that process
> 1 can 'see' the data from process 0, but the first version completes
> without any error, while the second version generates the following error:
>
> PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
> cuMemFree failed: invalid value
>
> The two versions are attached below. Would appreciate any insight as to
> what I'm doing wrong.

Context management in CUDA is a bit of a mess. In particular, resources in a
given context cannot be freed if that context isn't (or can't be made) the
active context in the current thread. Your Context.pop atexit makes sure that
PyCUDA can select contexts when it tries to do clean-up, which may (because of
MPI) run in a different thread than the one that does the bulk of the work.
That's my best guess as to what's going on.

tl;dr: The Context.pop() is the main difference.

Andreas
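
For illustration, here is a minimal sketch of the pattern described above: create a context explicitly and register a pop for interpreter exit. It is modeled on what pycuda.autoinit does, not on the attached scripts, which are not reproduced here. The function name make_context_with_cleanup and the device_index parameter are made up for this example, and it assumes PyCUDA and a CUDA device are available when the function is called.

```python
import atexit


def make_context_with_cleanup(device_index=0):
    """Create a CUDA context and arrange for it to be popped at exit.

    Hypothetical helper for illustration only; the name and the
    device_index parameter are assumptions, not part of PyCUDA.
    """
    # Imported lazily so that defining this function does not require a GPU.
    import pycuda.driver as drv

    drv.init()
    # make_context() creates the context and makes it current in this thread.
    ctx = drv.Device(device_index).make_context()

    def _finish_up():
        # Pop the context at interpreter exit so it is no longer left
        # current on this thread when PyCUDA runs its clean-up.
        ctx.pop()
        from pycuda.tools import clear_context_caches
        clear_context_caches()

    atexit.register(_finish_up)
    return ctx
```

With MPI, each rank would call something like make_context_with_cleanup(rank % drv.Device.count()) once at start-up, before allocating memory or opening IPC handles.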
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda