On 14 September 2017 at 11:44, Eric Snow <ericsnowcurren...@gmail.com> wrote:
> I've updated PEP 554 in response to feedback. (thanks all!) There
> are a few unresolved points (some of them added to the Open Questions
> section), but the current PEP has changed enough that I wanted to get
> it out there first.
>
> Notably changed:
>
> * the API relative to object passing has changed somewhat drastically
>   (hopefully simpler and easier to understand), replacing "FIFO" with
>   "channel"
> * added an examples section
> * added an open questions section
> * added a rejected ideas section
> * added more items to the deferred functionality section
> * the rationale section has moved down below the examples
>
> Please let me know what you think. I'm especially interested in
> feedback about the channels. Thanks!

I like the new pipe-like channels API more than the previous named
FIFO approach :)

>    send(obj):
>
>        Send the object to the receiving end of the channel. Wait until
>        the object is received. If the channel does not support the
>        object then TypeError is raised. Currently only bytes are
>        supported. If the channel has been closed then EOFError is
>        raised.

I still expect any form of object sharing to hinder your
per-interpreter GIL efforts, so restricting the initial implementation
to memoryview-only seems more future-proof to me.

> Pre-populate an interpreter
> ---------------------------
>
> ::
>
>    interp = interpreters.create()
>    interp.run("""if True:
>        import some_lib
>        import an_expensive_module
>        some_lib.set_up()
>        """)
>    wait_for_request()
>    interp.run("""if True:
>        some_lib.handle_request()
>        """)

I find the "if True:"'s sprinkled through the examples distracting, so
I'd prefer either:

1. Using textwrap.dedent; or
2. Assigning the code to a module level attribute

::

    interp = interpreters.create()
    setup_code = """\
    import some_lib
    import an_expensive_module
    some_lib.set_up()
    """
    interp.run(setup_code)
    wait_for_request()
    handler_code = """\
    some_lib.handle_request()
    """
    interp.run(handler_code)

> Handling an exception
> ---------------------
>
> ::
>
>    interp = interpreters.create()
>    try:
>        interp.run("""if True:
>            raise KeyError
>            """)
>    except KeyError:
>        print("got the error from the subinterpreter")

As with the message passing through channels, I think you'll really
want to minimise any kind of implicit object sharing that may
interfere with future efforts to make the GIL truly an *interpreter*
lock, rather than the global process lock that it is currently.

One possible way to approach that would be to make the low level run()
API a more Go-style API rather than a Python-style one, and have it
return a (result, err) 2-tuple.
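
A minimal sketch of that (result, err) shape, in plain Python with no
subinterpreters involved. The names `run_isolated`, `ForeignError`,
`ForeignInterpreterError` and `raise_local` are hypothetical and not
part of PEP 554, and `exec()` on a fresh dict merely stands in for
running code in another interpreter. The point is that only plain
strings cross the boundary, so no exception object or traceback frame
from the foreign side is shared.

```python
import traceback
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


class ForeignInterpreterError(RuntimeError):
    """Local exception standing in for an error raised elsewhere."""


@dataclass(frozen=True)
class ForeignError:
    # Only strings are stored: no exception object, no frames.
    type_name: str
    message: str
    formatted: str  # traceback text, already rendered on the foreign side

    def raise_local(self) -> None:
        # "raise" is a reserved word in Python, so a method cannot be
        # spelled err.raise(); raise_local() is a hypothetical stand-in.
        raise ForeignInterpreterError(f"{self.type_name}: {self.message}")


def run_isolated(
    source: str,
) -> Tuple[Optional[Dict[str, str]], Optional[ForeignError]]:
    """Hypothetical Go-style run(): return (result, err), never raise."""
    namespace = {"__name__": "__main__"}  # stand-in for a foreign __main__
    try:
        exec(source, namespace)
    except Exception as exc:
        err = ForeignError(type(exc).__name__, str(exc), traceback.format_exc())
        return None, err
    # Report public names as repr() strings so no live objects are shared.
    result = {k: repr(v) for k, v in namespace.items() if not k.startswith("_")}
    return result, None


result, err = run_isolated("raise KeyError('missing')")
if err is not None:
    try:
        err.raise_local()
    except ForeignInterpreterError as exc:
        print("got the error from the subinterpreter:", exc)
```

Because the local exception is created inside `raise_local()`, its
traceback consists only of frames from the calling side; the foreign
traceback survives solely as the text in `err.formatted`.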
"err.raise()" would then translate the foreign interpreter's exception into a local interpreter exception, but the *traceback* for that exception would be entirely within the current interpreter. > About Subinterpreters > ===================== > > Shared data > ----------- > > Subinterpreters are inherently isolated (with caveats explained below), > in contrast to threads. This enables `a different concurrency model > <Concurrency_>`_ than is currently readily available in Python. > `Communicating Sequential Processes`_ (CSP) is the prime example. > > A key component of this approach to concurrency is message passing. So > providing a message/object passing mechanism alongside ``Interpreter`` > is a fundamental requirement. This proposal includes a basic mechanism > upon which more complex machinery may be built. That basic mechanism > draws inspiration from pipes, queues, and CSP's channels. [fifo]_ > > The key challenge here is that sharing objects between interpreters > faces complexity due in part to CPython's current memory model. > Furthermore, in this class of concurrency, the ideal is that objects > only exist in one interpreter at a time. However, this is not practical > for Python so we initially constrain supported objects to ``bytes``. > There are a number of strategies we may pursue in the future to expand > supported objects and object sharing strategies. > > Note that the complexity of object sharing increases as subinterpreters > become more isolated, e.g. after GIL removal. So the mechanism for > message passing needs to be carefully considered. Keeping the API > minimal and initially restricting the supported types helps us avoid > further exposing any underlying complexity to Python users. > > To make this work, the mutable shared state will be managed by the > Python runtime, not by any of the interpreters. Initially we will > support only one type of objects for shared state: the channels provided > by ``create_channel()``. 
> Channels, in turn, will carefully manage
> passing objects between interpreters.

Interpreters themselves will also need to be shared objects, as:

- they all have access to "interpreters.list_all()"
- when we do "interpreters.create_interpreter()", the calling
  interpreter gets a reference to itself via
  "interpreters.get_current()"

(These shared objects are what I suspect you may end up needing a
process global read/write lock to manage, by the way - I think it
would be great if you can figure out a way to avoid that, it's just
not entirely clear to me what that might look like. I do think you're
on the right track by prohibiting the destruction of an interpreter
that's currently running, and the destruction of channels that are
currently still associated with an interpreter)

> Interpreter Isolation
> ---------------------
>

This section is a really nice addition :)

> Existing Usage
> --------------
>
> Subinterpreters are not a widely used feature. In fact, the only
> documented case of wide-spread usage is
> `mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_. On the one
> hand, this case provides confidence that existing subinterpreter support
> is relatively stable. On the other hand, there isn't much of a sample
> size from which to judge the utility of the feature.

Nathaniel pointed out that JEP embeds CPython subinterpreters inside
the JVM similar to the way that mod_wsgi embeds them inside Apache
httpd: https://github.com/ninia/jep/wiki/How-Jep-Works

> Open Questions
> ==============
>
> Leaking exceptions across interpreters
> --------------------------------------
>
> As currently proposed, uncaught exceptions from ``run()`` propagate
> to the frame that called it. However, this means that exception
> objects are leaking across the inter-interpreter boundary. Likewise,
> the frames in the traceback potentially leak.
>
> While that might not be a problem currently, it would be a problem once
> interpreters get better isolation relative to memory management (which
> is necessary to stop sharing the GIL between interpreters). So the
> semantics of how the exceptions propagate needs to be resolved.

As noted above, I think you *really* want to avoid leaking exceptions
in the initial implementation. A non-exception-based error signaling
mechanism would be one way to do that, similar to how the low-level
subprocess APIs actually report the return code, which higher level
APIs then turn into an exception.

resp.raise_for_status() does something similar for HTTP responses in
the requests API.

> Initial support for buffers in channels
> ---------------------------------------
>
> An alternative to support for bytes in channels is support for
> read-only buffers (the PEP 3118 kind). Then ``recv()`` would return
> a memoryview to expose the buffer in a zero-copy way. This is similar
> to what ``multiprocessing.Connection`` supports. [mp-conn]
>
> Switching to such an approach would help resolve questions of how
> passing bytes through channels will work once we isolate memory
> management in interpreters.

Exactly :)

> Resetting __main__
> ------------------
>
> As proposed, every call to ``Interpreter.run()`` will execute in the
> namespace of the interpreter's existing ``__main__`` module. This means
> that data persists there between ``run()`` calls. Sometimes this isn't
> desirable and you want to execute in a fresh ``__main__``. Also,
> you don't necessarily want to leak objects there that you aren't using
> any more.
>
> Solutions include:
>
> * a ``create()`` arg to indicate resetting ``__main__`` after each
>   ``run`` call
> * an ``Interpreter.reset_main`` flag to support opting in or out
>   after the fact
> * an ``Interpreter.reset_main()`` method to opt in when desired
>
> This isn't a critical feature initially. It can wait until later
> if desirable.
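
The persistence described in the quoted section can be sketched with
plain `exec()`, using a dict as a stand-in for the subinterpreter's
`__main__` namespace (an assumption for illustration; no subinterpreter
is involved). The helper `reset_namespace` is hypothetical and shows
just one possible reset policy: drop ordinary names and keep the
`__dunder__` entries. A wholesale clear of the same dict is included
for contrast.

```python
def is_dunder(name: str) -> bool:
    """True for names of the form __xxx__."""
    return len(name) > 4 and name.startswith("__") and name.endswith("__")


def reset_namespace(namespace: dict) -> None:
    """Hypothetical reset: remove everything except __dunder__ entries."""
    for name in [n for n in namespace if not is_dunder(n)]:
        del namespace[name]


# A dict standing in for the interpreter's __main__ namespace.
main_ns = {"__name__": "__main__"}

exec("cache = [1, 2, 3]", main_ns)    # first "run()" call
exec("total = sum(cache)", main_ns)   # second call still sees `cache`
persisted = main_ns["total"]

reset_namespace(main_ns)
kept_after_reset = sorted(main_ns)    # ordinary names gone, dunders kept

exec("globals().clear()", main_ns)    # wholesale clear, for contrast
kept_after_clear = sorted(main_ns)    # nothing is left, dunders included

print(persisted, kept_after_reset, kept_after_clear)
```

In this stand-in the selective reset keeps `__name__` and the
`__builtins__` entry that `exec()` adds, while the wholesale clear
leaves the dict empty.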

I was going to note that you can already do this:

    interp.run("globals().clear()")

However, that turns out to clear *too* much, since it also clobbers
all the __dunder__ attributes that the interpreter needs in a code
execution environment.

Either way, if you added this, I think it would make more sense as an
"importlib.util.reset_globals()" operation, rather than have it be
something specific to subinterpreters.

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia