[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workaround

Guido van Rossum Wed, 06 May 2020 08:02:38 -0700

Okay, a picture is emerging. It sounds like GIL-free subinterpreters may
one day shine because IPC is faster and simpler within one process than
between multiple processes. This is not exactly what I got from PEP 554 but
it is sufficient for me to have confidence in the project.


On Wed, May 6, 2020 at 5:41 AM Victor Stinner <[email protected]> wrote:

> Hi Nathaniel,
>
> On Wed, May 6, 2020 at 04:00, Nathaniel Smith <[email protected]> wrote:
> > As far as I understand it, the subinterpreter folks have given up on
> > optimized passing of objects, and are only hoping to do optimized
> > (zero-copy) passing of raw memory buffers.
>
> I think that you misunderstood PEP 554. It's a bare minimum API,
> and the idea is to *extend* it later to have an efficient
> implementation of "shared objects".
>
> --
>
> IMO it should be easy to share *data* (object "content") between
> subinterpreters, but each interpreter should have its own PyObject
> which exposes the data at the Python level. See the PyObject as a
> proxy to the data.
>
> It would badly hurt performance if a PyObject is shared by two
> interpreters: it would require locking or atomic variables for
> PyObject members and PyGC_Head members.
>
> It seems like right now, PEP 554 doesn't support sharing data, so
> it should still be designed and implemented later.
>
> Who owns the data? When can we release memory? Which interpreter
> releases the memory? I read somewhere that data is owned by the
> interpreter which allocates the memory, and its memory would be
> released in the same interpreter.
>
> How do we track data lifetime? I imagine a reference counter. When it
> reaches zero, the interpreter which allocates the data can release it
> "later" (it doesn't have to be done "immediately").
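
A minimal sketch of the lifetime scheme described above, not from the
original message. `SharedBlock` and `Owner` are hypothetical names, and the
owner object only models "the interpreter which allocates the data"; no real
subinterpreter API is used. When the count reaches zero the block is queued,
and the owner frees it later from its own context.

```python
import threading
from collections import deque


class SharedBlock:
    """Hypothetical stand-in for data owned by one interpreter."""

    def __init__(self, owner, payload):
        self.owner = owner
        self.payload = payload
        self._refs = 1                  # the owner holds the first reference
        self._lock = threading.Lock()   # protects the counter only

    def incref(self):
        with self._lock:
            self._refs += 1

    def decref(self):
        with self._lock:
            self._refs -= 1
            dead = self._refs == 0
        if dead:
            # Do not free here: hand the block back to its owner instead.
            self.owner.pending_release.append(self)


class Owner:
    """Hypothetical model of the interpreter that allocated the data."""

    def __init__(self):
        self.pending_release = deque()  # deque append/popleft are thread-safe
        self.released = []

    def allocate(self, payload):
        return SharedBlock(self, payload)

    def drain(self):
        # Called "later", from the owner's own context.
        while self.pending_release:
            block = self.pending_release.popleft()
            block.payload = None        # the actual release
            self.released.append(block)
```

A borrower calls `incref()` before using the block and `decref()` when done.
The payload stays alive until the owner runs `drain()`.
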
>
> How to lock the whole data or a portion of data to prevent data races?
> If data doesn't contain any PyObject, it may be safe to allow
> concurrent writes, but readers should be prepared for inconsistencies
> depending on the access pattern. If two interpreters access separated
> parts of the data, we may allow lock-free access.
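
A small sketch of the "separated parts" case above, not from the original
message. Threads stand in for interpreters here, and they still run under
the GIL, so this shows only the partitioning idea: each worker gets its own
`memoryview` slice of one raw buffer and writes to it without taking a lock.

```python
import threading

data = bytearray(8)        # stand-in for a shared raw buffer (no PyObjects inside)
view = memoryview(data)


def fill(part, value):
    # Each worker touches only its own slice, so no lock is taken.
    for i in range(len(part)):
        part[i] = value


# Two non-overlapping windows onto the same underlying memory.
halves = [view[:4], view[4:]]
workers = [
    threading.Thread(target=fill, args=(part, n + 1))
    for n, part in enumerate(halves)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Overlapping slices would bring back the inconsistency problem described
above and would need a lock or another protocol.
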
>
> I don't think that we have to reinvent the wheel. threading,
> multiprocessing and asyncio already designed such APIs. We should
> design similar APIs and even simply reuse code.
>
> My hope is that "synchronization" (in general, locks in particular) will
> be more efficient in the same process, than synchronization between
> multiple processes.
>
> --
>
> I would be interested to have a generic implementation of "remote
> object": an empty proxy object which forwards all operations to a
> different interpreter. It will likely be inefficient, but it may be
> convenient for a start. If a method returns an object, a new proxy
> should be created. Simple scalar types like int and short strings may
> be serialized (copied).
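
A rough sketch of the "remote object" idea above, not from the original
message. `Remote`, `Proxy`, and `MAX_STR` are hypothetical names, and
`Remote` is an ordinary in-process object that only models the other
interpreter. Method calls are forwarded, scalar results are returned by
value, and any other result stays on the remote side behind a new proxy.

```python
SCALARS = (int, float, bool, type(None))
MAX_STR = 64   # arbitrary cut-off for "short" strings (an assumption)


class Remote:
    """Hypothetical stand-in for the other interpreter; owns the real objects."""

    def __init__(self):
        self._objects = {}

    def export(self, obj):
        # Keep the object alive and hand out an opaque handle for it.
        self._objects[id(obj)] = obj
        return id(obj)

    def call(self, handle, name, args, kwargs):
        result = getattr(self._objects[handle], name)(*args, **kwargs)
        if isinstance(result, SCALARS) or (
            isinstance(result, str) and len(result) <= MAX_STR
        ):
            return ("value", result)              # simple scalar: copied
        return ("handle", self.export(result))    # anything else stays remote


class Proxy:
    """Empty local object that forwards method calls to the remote side."""

    def __init__(self, remote, handle):
        self._remote = remote
        self._handle = handle

    def __getattr__(self, name):
        # Only reached for attributes the proxy itself does not have.
        def forward(*args, **kwargs):
            kind, payload = self._remote.call(self._handle, name, args, kwargs)
            if kind == "value":
                return payload
            return Proxy(self._remote, payload)   # a method returned an object
        return forward


remote = Remote()
items = Proxy(remote, remote.export([3, 1, 2]))
items.sort()             # runs on the remote list
duplicate = items.copy() # result is a list, so it comes back as a new Proxy
```

This sketch passes arguments through unchanged and does not forward
operator syntax such as `len(items)` or `items[0]`, because Python looks up
special methods on the type and bypasses `__getattr__`. A real
implementation would need to handle both.
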
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
