[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

Eric V. Smith Wed, 17 Jun 2020 11:33:16 -0700

On 6/17/2020 12:07 PM, Jeff Allen wrote:

On 12/06/2020 12:55, Eric V. Smith wrote:
On 6/11/2020 6:59 AM, Mark Shannon wrote:
Different interpreters need to operate in their own isolated addressspace, or there will be horrible race conditions.
Regardless of whether that separation is done in software or hardware,
it has to be done.
I realize this is true now, but why must it always be true? Can't wefix this? At least one solution has been proposed: passing around apointer to the current interpreter. I realize there issues here, likecallbacks and signals that will need to be worked out. But I don'tthink it's axiomatically true that we'll always have race conditionswith multiple interpreters in the same address space.
Eric
Axiomatically? No, but let me rise to the challenge.
If (1) interpreters manage the life-cycle of objects, and (2) a racecondition arises when the life-cycle or state of an object is accessedby the interpreter that did not create it, and (3) an object willsometimes be passed to an interpreter that did not create it, and (4)an interpreter with a reference to an object will sometimes access itslife-cycle or state, then (5) a race condition will sometimes arise.This seems to be true (as a deduction) if all the premises hold.
(1) and (2) are true in CPython as we know it. (3) is prevented(completely?) by the Python API, but not at all by the C API. (4) isimplicit in an interpreter having access to an object, the way CPythonand its extensions are written, so (5) follows in the case that the CAPI is used. You could change (1) and/or (2), maybe (4).

I'm assuming that passing an object between interpreters would not besupported. It would require that the object somehow be marshalledbetween interpreters, so that no object would be operated on outside theinterpreter that created it. So 2-5 couldn't happen in valid code.

"Passing around a pointer to the current interpreter" sounds like anattempt to break (2) or maybe (4). But I don't understand "current".What you need at any time is the interpreter (state and life-cyclemanager) for the object you're about to handle, so that the receivinginterpreter can delegate the action, instead of crashing ahead itself.This suggests a reference to the interpreter must be embedded in eachobject, but it could be implicit in the memory address.

Sorry for being loose with terms. If I want to create an interpreter andexecute it, then I'd allocate and initialize an interpreter stateobject, then call it, passing the interpreter state object in towhatever Python functions I want to call. They would in turn pass thatpointer to whatever they call, or access the state through it directly.That pointer is the "current interpreter".

There is then still an issue that the owning interpreter has to bethread-safe (if there are threads) in the sense that it can serialiseaccess to object state or life-cycle. If serialisation is by a GIL,the receiving interpreter must take the GIL of the owning interpreter,and we are somewhat back where we started. Note that the "currentinterpreter" is not a function of the current thread (or vice-versa).The current thread is running in both interpreters, and by hypothesis,so are the competing threads.

Agreed that an interpreter shouldn't belong to a thread, but since aninterpreter couldn't access objects of another interpreter, there'd beno need for cross-intepreter locking. There would be a GIL perinterpreter, protecting access to that interpreter's state.

Can I just point out that, while most of this argument concerns aparticular implementation, we have a reason in Python (the language)for an interpreter construct: it holds the current module context, sothat whenever code is executing, we can give definite meaning to the'import' statement. Here "current interpreter" does have a meaning,and I suggest it needs to be made a property of every function objectas it is defined, and picked up when the execution frame is created.This *may* help with the other, internal, use of interpreter, forlife-cycle and state management, because it provides a recognisablepoint (function call) where one may police object ownership, but thatisn't why you need it.

There's a lot of state per interpreter, including the module state. See"struct _is" in Include/internal/pycore_interp.h.


Eric

Jeff Allen



_______________________________________________
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/GACVQJNCZLT4P3YX5IISRBOQTXXTJVMB/
Code of Conduct: http://python.org/psf/codeofconduct/

_______________________________________________
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/NFSF5MC6WMGPDL77AEALKOCYN4MPDWV4/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

Reply via email to