Hi Nick,

As far as I understand, the (to me) essential difference between your
approach and my proposal is that:

Approach 1 (PEP-489):
* Single (global) GIL.
* PyObject's may be shared across interpreters (zero-copy transfer)

Approach 2 (mine):
* Per-interpreter GIL.
* PyObject's must be copied across interpreters.

To me, the per-interpreter GIL is the essential "target" I am aiming for,
and I am willing to sacrifice the zero-copy for that.
If the GIL is still shared then I don't see much advantage of this approach
over just using the "threading" module with a single interpreter.
(I realize it still gives you some isolation between interpreters.
To me personally this is not very interesting, but this may be myopic.)

> For the time being though, a single GIL remains
> much easier to manage.

"For the time being" suggests that you intend approach 1 to be
ultimately a stepping stone to something similar to approach 2?

> Yes, something like Rust's ownership model is the gist of what we had
> in mind (i.e. allowing zero-copy transfer of ownership between
> subinterpreters, but only the owning interpreter is allowed to do
> anything else with the object).

This can be emulated in approach 2 by creating a wrapper C-level type which
contains a PyObject and its corresponding interpreter, so that interpreter A
can reference an object in interpreter B.

>> 3. Practically all existing APIs, including Py_INCREF and Py_DECREF,
>> need to get an additional explicit interpreter argument.
>> I imagine that we would have a new prefix, say MPy_, because the
>> existing APIs must be left for backward compatibility.
>
> This isn't necessary, as the active interpreter is already tracked as
> part of the thread local state (otherwise mod_wsgi et al wouldn't work
> at all).

I realize that it is possible to do it that way.
However, this has some disadvantages:

* The interpreter becomes tied to a thread, or you need to have some way
  to switch interpreters on a thread.
  (Which makes your code look like OpenGL code ;-) )

* Once you are going to write code which manipulates objects in multiple
  interpreters (e.g. my proposed copy function or the "foreign interpreter
  wrapper" I discussed above) making the interpreter explicit probably
  avoids headaches.

* Explicit is better than implicit, as somebody once said. ;-)

Stephan

2017-05-26 15:17 GMT+02:00 Nick Coghlan <ncogh...@gmail.com>:

> On 26 May 2017 at 22:08, Stephan Houben <stephan...@gmail.com> wrote:
>> Hi all,
>>
>> Personally I feel that the current subinterpreter support falls short
>> in the sense that it still requires
>> a single GIL across interpreters.
>>
>> If interpreters would have their own individual GIL,
>> we could have true shared-nothing multi-threaded support similar to
>> Javascript's "Web Workers".
>>
>> Here is a point-wise overview of what I am imagining.
>> I realize the following is very ambitious, but I would like to bring
>> it to your consideration.
>>
>> 1. Multiple interpreters can be instantiated, each of which is
>> completely independent.
>> To this end, all global interpreter state needs to go into an
>> interpreter structure, including the GIL
>> (which becomes per-interpreter)
>> Interpreters share no state whatsoever.
>
> There'd still be true process global state (i.e. anything managed by
> the C runtime), so this would be a tiered setup with a read/write GIL
> and multiple SILs. For the time being though, a single GIL remains
> much easier to manage.
>
>> 2. PyObject's are tied to a particular interpreter and cannot be
>> shared between interpreters.
>> (This is because each interpreter now has its own GIL.)
>> I imagine a special debug build would actually store the
>> interpreter pointer in the PyObject and would assert everywhere
>> that the PyObject is only manipulated by its owning interpreter.
>
> Yes, something like Rust's ownership model is the gist of what we had
> in mind (i.e.
> allowing zero-copy transfer of ownership between
> subinterpreters, but only the owning interpreter is allowed to do
> anything else with the object).
>
>> 3. Practically all existing APIs, including Py_INCREF and Py_DECREF,
>> need to get an additional explicit interpreter argument.
>> I imagine that we would have a new prefix, say MPy_, because the
>> existing APIs must be left for backward compatibility.
>
> This isn't necessary, as the active interpreter is already tracked as
> part of the thread local state (otherwise mod_wsgi et al wouldn't work
> at all).
>
>> 4. At most one interpreter can be designated the "main" interpreter.
>> This is for backward compatibility of existing extension modules ONLY.
>> All the existing Py_* APIs operate implicitly on this main interpreter.
>
> Yep, this is part of the concept. The PEP 432 draft has more details
> on that:
> https://www.python.org/dev/peps/pep-0432/#interpreter-initialization-phases
>
>> 5. Extension modules need to explicitly advertise multiple interpreter
>> support.
>> If they don't, they can only be imported in the main interpreter.
>> However, in that case they can safely use the existing Py_ APIs.
>
> This is the direction we started moving in with the multi-phase
> initialisation PEP for extension modules:
> https://www.python.org/dev/peps/pep-0489/
>
> As Petr noted, the main missing piece there now is the fact that
> object methods (as opposed to module level functions) implemented in C
> currently don't have ready access to the module level state for the
> modules where they're defined.
>
>> 6. Since PyObject's cannot be shared across interpreters, there needs to be
>> an explicit function which takes a PyObject in interpreter A and constructs
>> a similar object in interpreter B.
>>
>> Conceptually this would be equivalent to pickling in A and
>> unpickling in B, but presumably more efficient.
>> It would use the copyreg registry in a similar way to pickle.
>
> This would be an ownership transfer rather than a copy (which carries
> the implication that all the subinterpreters would still need to share
> a common memory allocator)
>
>> 7. Extension modules would also be able to register their function
>> for copying custom types across interpreters.
>> That would allow extension modules to provide custom types where
>> the underlying C object is in fact not copied
>> but shared between interpreters.
>> I would imagine we would have a "shared memory" memoryview object
>> and also Mutex and other locking constructs which would work
>> across interpreters.
>
> We generally don't expect this to be needed given an ownership focused
> approach. Instead, the focus would be on enabling efficient channel
> based communication models that are cost-prohibitive when object
> serialisation is involved.
>
>> 8. Finally, the main application: functionality similar to the current
>> `multiprocessing' module, but with
>> multiple interpreters on multiple threads in a single process.
>> This would presumably be more efficient than `multiprocessing' and
>> also allow extra functionality, since the underlying C objects
>> can in fact be shared.
>> (Imagine two interpreters operating in parallel on a single OpenCL
>> context.)
>
> We're not sure how feasible it will be to enable this in general, but
> even without it, zero-copy ownership transfers enable a *lot* of
> interesting concurrency models that Python doesn't currently offer great
> primitives to support (they're mainly a matter of using threads in
> certain ways, which means they not only run afoul of the GIL, but you
> also don't get any assistance from the interpreter in strictly
> enforcing object ownership rules).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
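
To make the "foreign interpreter wrapper" and the ownership rule from this
thread a bit more concrete, here is a toy model in pure Python. It is only a
sketch under loud assumptions: interpreters are modelled as plain string ids,
and ForeignRef, get() and transfer() are hypothetical names, not an existing
or proposed CPython API. In the actual proposal this would be a C-level type
holding a PyObject and its interpreter.

```python
class ForeignRef:
    """Toy wrapper pairing an object with the interpreter that owns it.

    Interpreters are modelled as plain string ids (an assumption of this
    sketch); nothing here talks to real subinterpreters.
    """

    def __init__(self, obj, owner):
        self._obj = obj
        self.owner = owner

    def get(self, current):
        # Only the owning interpreter may touch the wrapped object,
        # mirroring the assertion a special debug build could make.
        if current != self.owner:
            raise RuntimeError(
                f"object owned by {self.owner!r}, accessed from {current!r}"
            )
        return self._obj

    def transfer(self, current, new_owner):
        # Zero-copy hand-over: the object itself is not copied,
        # only the recorded owner changes.
        if current != self.owner:
            raise RuntimeError("only the owner may transfer ownership")
        self.owner = new_owner


ref = ForeignRef([1, 2, 3], owner="A")
ref.get("A").append(4)   # fine: "A" owns the list
ref.transfer("A", "B")   # hand the same list over to "B" without copying
```

After the transfer, ref.get("B") returns the same list object, while
ref.get("A") raises RuntimeError. Passing the current interpreter explicitly
to every call is the "explicit interpreter argument" style argued for above;
the thread-local alternative would drop that parameter and look it up
implicitly.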